1. FTP can be implemented using either TCP or UDP. Discuss the advantages and disadvantages of each protocol when implementing FTP. Which protocol does FTP use?


    To answer the second part first: FTP uses TCP because the spec says it should (File Transfer Functions, sec. 4, and Connection Establishment, sec. 8, in File Transfer Protocol (FTP), RFC 959).

    The answers, approximately verbatim, and in no particular order:

    1. The advantages of TCP when implementing FTP are,

      • Since the TCP is a connection-oriented protocol reliable data transformation takes place. [1]
      • TCP provides interoperability when implementing FTP. [2]
      • It operates independently [3] and enables internetworking [4]
      • While transmitting if the data is lost it is resent to the destination until it receives an acknowledgement
      • There is no duplication of data in TCP

      The disadvantages of TCP when implementing FTP are,

      • It is costly to set up a TCP connection
      • The time consumption for this protocol is more [5]
      • It is intricate to set up and manage [6]

      The advantages of UDP when implementing FTP are,

      • Since UDP is a connectionless oriented it has less frame work [7]
      • It uses less processing time when compared to TCP [8]
      • It has less latency [9] and is application flexible [10]

      The disadvantages of UDP when implementing FTP are,

      • There is no reliability whether the data is transmitted or not [11]
      • There is duplication of data in UDP
      • There is no retransmission of data if it is lost
      • There is lack of communication between the server and the client. [12]
      Hence the FTP protocol uses the TCP protocol because there is reliability because the data sent gets delivered accurately at the destination.

      1: Transformation? Perhaps you meant “transmission”?

      2: What does “interoperability” mean? Are you saying that UDP wouldn't provide interoperability?

      3: Independently of what?

      4: Doesn't UDP establish internetworking too?

      5: More than what?

      6: What are some of the intricacies? What kind of management? Is this at the end-points or in the network?

      7: Less framework than what? Why is this important to FTP?

      8: Good. You compared X to something, and included what X is being compared to.

      9: Less latency than what?

      10: What is application flexibility? Why is it important to FTP?

      11: How do you have reliability when not transmitting data?

      12: What kind of communication is lacking? Does this mean communication is unreliable?

    2. UDP is a connectionless protocol which means it sends data immediately. That makes it a fast protocol. [13] So, it is used with applications that require fast transmission of data. For instance, it is used for DNS, DHCP, and real time communication. Some UDP disadvantages are reliability which means it is possible to not receive data. It also sends data not ordered and data reaches in random order. (Vivek Gite, Cyberciti.biz, 15 May 2007. Web. 18 Sep. 2012.)

      On the other hand, TCP is a connection oriented protocol and that means it tests and checks connection [14] then start sending data, and it guarantees that all data will reach the destination. It also sends data ordered. For that, TCP is a reliable protocol, and it corrects mistakes and re-sends corrupted data again. So it is used for HTTPS, SMTP, and FTP. However, that makes TCP slower than UDP when sending data. (Erik Rodriguez, Skullbox.net, 26 July 2011. Web. 18 Sep. 2012.)

      FTP uses TCP protocol.

      13: Fast by what measure? Throughput? Delay? Low host overhead?

      14: Test and checks the connection in what way? Does it do resource allocation?

      • TCP implementation:
        • Advantages:
          • Data is guaranteed to reach the end point, will reach in predicted time, [15] lack of duplication.
          • All work needed to be done with sending the data is done for the user. [16]
          • Automatic breakdown of data into packets.
        • Disadvantages:
          • Any problems in the OS will cause problems while using the network. [17]
          • Cannot be used to broadcast and multicast connections [18]
      • UDP implementation:
        • Advantages:
          • Can broadcast and multicast connections
          • User is not restricted to the connection based communication model
          • Faster than TCP [19]
      • FTP only uses TCP.
      Sources: laynetworks.com — comparative analysis

      15: Is this true? What kind of predictions are made? By what mechanism are predictions made?

      16: Isn't that true for UDP too? Did you mean “reliably sending the data”?

      17: Isn't that true for any network protocol?

      18: Is multicast or broadcast for FTP?

      19: Faster in what sense? Faster raw throughput? Faster usable throughput? Under what conditions?

    3. The main difference between TCP and UDP is that TCP connects and remains connected to the destination of the data while UDP does not; therefore, TCP is more reliable protocol while UDP is a faster one. Since transferring files is a function that ideally yields success on the first attempt, a more reliable protocol is typically used. FTP uses TCP to help ensure this success. Research Source: http://www.bleepingcomputer.com/tutorials/TCP-and-UDP-ports-explained/
      Section: “The two Internet workhorses: UDP and TCP”

      TCP stands for Transmission Control Protocol. Using this method, the computer sending the data connects directly to the computer it is sending the data it to, and stays connected for the duration of the transfer. With this method, the two computers can guarantee that the data has arrived safely and correctly, and then they disconnect the connection. This method of transferring data tends to be quicker and more reliable, but puts a higher load on the computer as it has to monitor the connection and the data going across it. A real life comparison to this method would be to pick up the phone and call a friend. You have a conversation and when it is over, you both hang up, releasing the connection.

      UDP stands for User Datagram Protocol. Using this method, the computer sending the data packages the information into a nice little package and releases it into the network with the hopes that it will get to the right place. What this means is that UDP does not connect directly to the receiving computer like TCP does, but rather sends the data out and relies on the devices in between the sending computer and the receiving computer to get the data where it is supposed to go properly. This method of transmission does not provide any guarantee that the data you send will ever reach its destination. On the other hand, this method of transmission has a very low overhead and is therefore very popular to use for services that are not that important to work on the first try. A comparison you can use for this method is the plain old US Postal Service. You place your mail in the mailbox and hope the Postal Service will get it to the proper location. Most of the time they do, but sometimes it gets lost along the way.

      …FTP servers use TCP ports 20 and 21 to send and receive information, so you won't have any conflicts with the web server running on TCP port 80. [20]

      20: Ugh, don't do this. Cutting and pasting is a complete waste of time: you learn nothing from it, and I can look it up from the citations if I need to.

    4. ## FTP / TCP
      • Advantages:
        • Connection-oriented protocol: any connection has to be set up before transferring data to the other side by using the 3-way handshake system, as a result, TCP has the ability to establish a connection (HTTP://www.computer-networks.blurtit.com, The Advantages Of Using TCP Over UDP).
        • Reliability: FTP uses the reliable TCP in order to guarantee that data (files) are sent and received without loss of data and without any corruption (HTTP://tcpipguide.com, FTP overview).
        • Flexibility by supporting multiple data types and file types [21] (HTTP://tcpipguide.com, FTP overview).
        • Error correction by detecting errors and correcting them and resending segments when it's needed (HTTP://www.en.wikipedia.org, TCP).
      • Disadvantages:
      ## FTP / UDP
      • Advantages:
      • Disadvantages:
        • Connectionless protocol: UDP uses a simple transmission model, so UDP does not establish a connection, and it just sends the data (segments) directly without set up connection before transferring data to the other side (HTTP:www.en.wikipedia.org, UDP).
        • Unreliability.
      ## FTP primary uses TCP.

      21: Is it the case that UDP doesn't support multiple data types and file types? What are data and file types?

    5. Advantages of TCP when implementing FTP are:

      • It provides reliability, TCP gives the receiver a complete copy of the file. Data packets that are lost are resent again, if the connection fails then the data is re-requested, thus making sure that data is received at the other end. Data arrives in order and that there are no duplicates.
      • It provides congestion control, the mechanism that throttles the sender when one or more links between sender and receiver becomes excessively congested.
      • It provides flow control.
      • This makes application developers work easier since everything is implemented at network level.

      Disadvantages of TCP when implementing FTP are

      • The extra overhead makes the file transmission slower.
      • It is expensive in terms of overhead at execution time, since it needs to provide reliability.
      • This does not support broadcast and multicast file transfers. [22]

      Advantages of UDP when implementing FTP are

      • It is much faster than TCP because it doesn't need to establish any connection between end points.
      • It is less expensive when compared to TCP in terms of overhead at execution time since it has only 8 bytes of overhead.
      • This supports broadcast and multicast file transfers. [23]

      Disadvantages of UDP when implementing FTP are

      • This doesn't provide reliability i.e. does not guarantee the data transfer, disordered packets, so this makes application developers work tedious to implement it in application level.
      • UDP's best effort service does not protect against datagram duplication, i.e., an application may receive multiple copies of the same UDP datagram.
      FTP uses TCP protocol
      Reference: Textbooks and various sites.

      22: Is this something important to FTP?

      23: If UDP doesn't establish connections between end-points, how can it support broadcast or multicast? What does “support” mean?

      • The advantage of using TCP when implementing FTP is that it provides a reliable connection guarantees correct sequencing of IP datagrams, [24] guarantees delivery, guarantees no duplication. It can also establish connections between two different types of computers and servers [25] and has scalable client/server architecture. [26] The disadvantage of using TCP is it is complex to set up and manage. [27]
      • The advantages of using UDP when implementing FTP is that there is lower latency, application flexibility, and no connection state. The disadvantages are it does not maintain a reliable connection, does not preserve sequences, [28] does not guarantee delivery or protect against duplication. Is best for minimum protocol intervention.
      • FTP uses TCP/IP
      Source: Notes from Networking and Internet Technologies class at Rutgers.

      24: Is that what the application's sending, IP datagrams?

      25: Can't UDP, or IP for that matter, do the same thing?

      26: Scalable in what sense?

      27: Complex for which side? Complex in what way?

      28: Does not preserve sequences of what?

    6. The applications associated with FTP [29] require all the data to be received in correct order. It is fairly simple that TCP provides this service and that's why FTP uses TCP, and not UDP.

      In TCP lost packets are resent and thus it is reliable. It guarantees efficient delivery. [30] TCP guarantees 3 things:

      • The data gets there.
      • The data gets there in order.
      • Without duplication.

      It has good throughput on a modem or LAN. [31]

      Disadvantages of UDP

      A packet may not be delivered in order or may be duplicated. And you get no indication that its been delivered or not unless the listening one says something.

      UDP has no flow control.

      No retransmission if data collides.

      Disadvantages of TCP

      • Extra overhead makes the transmission slower where in file transfer of large files transmission speed is important.
      • Its latency is its downside of TCP. It has to wait for acknowledgments.
      • TCP cannot be used for broadcast or multicast connections. [32]

      Advantages of UDP while implementing FTP in it.

      • Broadcast or multicast connections are available. [33]
      • Much faster than TCP.
      • Less expensive.

      29: FTP is the application being considered. What are the applications associated with FTP?

      30: Efficient in what sense? Efficient at the end-points? In the operating system? Over the network?

      31: Why is throughput across a modem interesting?

      32: So? Why is that important to FTP?

      33: Does FTP care at all about broadcast or multicast?

  2. How big would a 5,000 byte file be if it was encoded using base64? Assume 1) lines in the encoded file are 80 characters long except the last one, which may be shorter, 2) the newline character ('\n') is one character and appears at the end of every line, and 3) the encoded file contains only lines from the original file; there is no extra header or footer information added.


    Base64 encodes a source sequence of three bytes as a result sequence of four bytes by encoding successive groups of six bits from the source sequence as a byte (eight bits) in the result sequence. If necessary, the source sequence is padded (appended at the end) with one or two null (all zero) bytes to make the sequence size in bytes evenly divisible by three. Each result byte that would encode only pad bits is replaced with a distinguished byte value ('=', which must exist because the result-sequence representation uses only 64 of the possible 256 values available), so the number of '=' bytes at the end of the result equals the number of null bytes appended to the source sequence.

    Source: Base 64 Encoding (section 4) in Base-N Encodings (rfc 4648).

    A 5,000-byte sequence requires one pad byte to be divisible by three, and the Base64-encoded resulting sequence contains (5001/3)*4 = 6668 bytes, the last of which is the '=' pad byte. The result sequence divides into ceiling(6668/79) = 85 lines of at most 79 characters plus a newline, and the total size of the result sequence is 6668 + 85 = 6753 bytes.
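    The size arithmetic can be checked with a short Python sketch. The function name `wrapped_base64_size` and its `line_len` parameter are made up for illustration, and the sketch assumes, as the question's wording suggests, that the newline counts as one of the 80 characters on a full line.

```python
import base64

def wrapped_base64_size(n_bytes: int, line_len: int = 80) -> tuple:
    """Return (encoded characters, lines, total bytes) for an n_bytes input.

    line_len includes the trailing newline, so a full line carries
    line_len - 1 Base64 characters.
    """
    encoded = 4 * ((n_bytes + 2) // 3)       # each 3-byte group becomes 4 characters
    per_line = line_len - 1                  # characters per line, newline excluded
    lines = (encoded + per_line - 1) // per_line
    return encoded, lines, encoded + lines   # one newline per line

# Cross-check the character count against the standard-library encoder.
assert len(base64.b64encode(bytes(5000))) == wrapped_base64_size(5000)[0]

print(wrapped_base64_size(5000))  # (6668, 85, 6753)
```

    Note that the '=' pad character is already counted among the four characters of the final group; it is not an extra byte on top of them.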

    The answers, approximately verbatim, and in no particular order:

    1. Input file size = 5000 bytes

      Base64=bytes + 2 [1] - ((bytes + 2) MOD 3)/3 * 4
      =5000 + 2 - ((5000 + 2) % 3) / 3 * 4
      =6668 bytes = 6.51172 KB [2]
      The output file size is 6668 bytes = 6.51172 KB. [3]
      (HTTP://www.obviex.com, How to Calculate the Size of Encrypted Data).

      1: Why “+ 2”? What does it represent? Could it be related to padding? How?

      2: Why translate from exact character units to approximate kilobyte units?

      3: Does this include the newlines added to the encoded result?

    2. The formula for measuring the size of file encrypted [4] with Base64 is:

      Base64 = (File_size + 2 [5] - ((File_size + 2) MOD 3))/3 [6] * 4 [7]

      Base64 = (5000 + 2 - ((5000 + 2) MOD 3))/3 * 4

      The file size will be: 6668 Byte. [8]

      4: Encoded, not encrypted, although the information is obscured after being run through Base64.

      5: Why 2? What does it represent?

      6: Anything mod 3 will be less than 3, and dividing anything mod 3 by 3 makes it less than 1. What does this value represent?

      7: Is padding included somewhere in this equation? Is that what the + 2 is for?

      8: Where does this calculation account for the newlines added to the encoded result?

    3. Original file size is 5000 bytes.
      So 5000 byte file approximately [9] contains 5000 characters.
      Base64 encoding generates a file which is 137% [10] more than of original file.
      So encoded file size is 5000*1.37 = 6850 bytes
      So it approximately contains 6850 characters. [11]
      Number of lines in that encoded file = 6850/80 = 85 lines
      Since, each line in encoded file has 80 characters. [12] [13]
      References: multiple sites [14]

      9: Approximately? Why approximately? Why not exactly?

      10: 137%? How did you get that number?

      11: Again, why approximately?

      12: Given assumption 1 in the question, is it always true that a line has 80 characters?

      13: What's the answer? Is it 85 lines?

      14: This is an unhelpful citation. Where would I go if I wanted to check your answer?

    4. Output size = ((input_size - 1)/3)*4 + 4 [15]
      ((5000 - 1)/3)*4 + 4 = 6669
      final size = output_size + (output_size/80)*2 [16]
      6669 + (6669/80)*2 = 6835.725 bytes [17]

      Source http://stackoverflow.com/questions/1533113/calculate-the-size-to-a-base-64-encoded-message ; answer by kanaka.

      15: Why does this formula subtract one from the input size? And what does that final “+ 4” represent?

      16: What does multiplying by 2 represent?

      17: Fractional bytes? There are bits left over? (Note that 5/8 = 0.625 < 0.725 < 0.75 = 6/8, so there are fractional bits too.) Is that how Base64 encoding works?

    5. 4[n/3]
      (4[5000/3])/80 [18]
      (4[1666.67])/80 [19]
      6666.68/80 [20]
      83.33 [21]

      Sources: Computer Networks, pg 702

      18: Is there any padding being described by this equation? Does Base64 do any padding?

      19: Are the added newlines accounted for by this equation?

      20: What units are 6666.68 in?

      21: Units?

    6. Base64 takes 3 bytes at one time and converts them to 4 Base64 characters. A 5,000 byte file would consist of 6666.67 characters after the conversion (5000 bytes/3 bytes per group = 1666.67 groups of 3 bytes. [22] 1666.67 groups * 4 characters per group = 6666.67 characters). 6667 characters (rounded up) [23] will require 84.39 lines in the file (6667/79 characters per line, \n is the 80th character on each line). Final answer: The file will be 85 lines long to sufficiently contain the 5,000 byte file encoded in Base64. [24]

      Research Source: http://email.about.com/cs/standards/a/base64_encoding.htm, section Base64 to the rescue (explanation and example).

      22: How is it Base64 produces fractional characters?

      23: How are characters rounded up?

      24: But how long is a line?

    7. Total bytes of data = 5000 bytes.

      The base64 converts 3 bytes of data into 4 characters. So 5000 bytes of data is converted into

      \[\frac{5000 \times 4}{3} = 6666.66 = 6667\] characters [25]

      Given that each line has 80 characters. So clearly number of lines = 6667/80 = 83.3 lines.

      i.e. Total no. of lines = 84 [26]

      25: How does the rounding up go? What extra data is added to produce an integer?

      26: But how long is a line?

    8. Given a 5000-byte file, Base64 converts 3 bytes of data into 4 characters i.e., 1 byte = (4/3)characters. [27] So for 5000 bytes it is = 5000*(4/3) = 6666.667 characters. [28] Lines in the encoded file contains are 80 characters i.e., 1 line contains 80 characters. [29] Hence, the no. of lines = 6666.667/80 = 83.333 lines.

      27: What kind of unit is characters? How many bits does it have?

      28: Fractional characters? What do they look like?

      29: Is that true? Is that what the problem states? (Hint: no)

  3. What is Zipf's law? How would a file server exploit Zipf's law to improve performance? What would be the expected benefits of exploiting Zipf's law?


    Zipf's law was originally an observation about the relation between frequency and rank ordering in English words: the frequency of the ith most frequently occurring word is proportional to 1/i. If Zipf's law holds, the most frequent word (rank 1) has a frequency that's twice the frequency of the second most frequent word, three times the frequency of the third most frequent word, and so on.

    A server offering a population of objects for which Zipf's law holds can exploit the law by creating a static, two-level cache. A two-level cache is sufficient because Zipf's law divides the population into popular (frequent) and unpopular (infrequent) objects, and the popular objects are much more popular than the unpopular objects. The cache can be static because accessing an unpopular object can be considered a rare event, and moving the object into the popular cache would be a waste of time because it's unlikely to be accessed again soon.

    Source: HTTP://xlinux.nist.gov/dads/HTML/zipfslaw.html
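    To get a feel for how much a small static cache of popular objects buys, here is a rough Python sketch. It assumes request frequency is exactly proportional to 1/rank; the function name `zipf_hit_rate` and the population of 10,000 objects are illustrative choices, not figures from the question.

```python
def zipf_hit_rate(cache_size: int, population: int) -> float:
    """Fraction of requests served from a static cache holding the
    cache_size most popular objects, assuming frequency ~ 1/rank."""
    weights = [1.0 / rank for rank in range(1, population + 1)]
    return sum(weights[:cache_size]) / sum(weights)

# Hypothetical server with 10,000 objects: hit rate for a few cache sizes.
for k in (10, 100, 1000):
    print(k, round(zipf_hit_rate(k, 10_000), 3))
```

    Under these assumptions, caching the top 1% of the objects already serves roughly half of the requests, which is why a static split into popular and unpopular objects can be good enough.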

    The answers, approximately verbatim, and in no particular order:

    1. ## Zipf's law: “Zipf's law is the observation that frequency of occurrence of some event (P)”
      (HTTP://www.linkage.rockefeller.edu, introduction to Zipf's law).

      ## A File server can exploit Zipf's law to improve performance by identifying the most popular requested files on itself, and it can also tell how many times these files have been downloaded, modified, or even opened. [1] So, the file server can make a prior access to these files and put them on the top of its files’ list. Thus, this process will improve the file server performance because it is going to speed up the access speed to the requested file.

      ## As mentioned previously, Zipf's law will be beneficial to identify what or where the popularity is and gives it the priority to be accessed; for example, Zipf's law can give a hand of help with search engines, etc.

      1: What in particular about Zipf's law improves performance over the usual cache operation?

    2. [ not answered ] [2]

      2: Always answer the question. Even a lame answer can earn some points, but no answer can only earn zero points.

    3. Zipf's law states that the probability of occurrence of words or other items starts from high occurrence and then reduces off. Thus, many items occur rarely while a few occur very often. In other words, the frequency of occurrence of any word is inversely proportional to its rank in the frequency occurrences table.

      Formula: P_x = 1/x^a, where P_x is the frequency of occurrence of the x-th ranked item and a is close to 1.

      A file server would exploit a Zipf's law to improve the performance by enhancing the performance of the cache i.e. this property leads to effective web caches, which contain the most popular objects and typically employ the least frequently used replacement policy due to which the server often achieves higher cache hit rates.

      The expected benefits of exploiting the Zipf's law are:

      • Since the requests are served immediately from the cache, the response time can be significantly faster than contacting the origin server. [3]
      • Caching conserves bandwidth by avoiding redundant transfers along remote internet links. [4]
      • The content reaches the users more quickly and they avoid being overloaded themselves by many direct requests.

      3: Is this a property of Zipf's law, or of caching?

      4: Ok, but what does this have to do with Zipf's law?

    4. Zipf's Law (refers to George K. Zipf) describes the incidence of distinct objects [5] in special sorts of collections. [6] (Aaron Krowne, Planetmath.org. Version 4. Web. 18 Sep 2012.)

      Server exploits Zipf's law can improve sorting and delivering data methods according to popularity and importantly of data. [7] As a result, that can decrease access time to contents in the server which saves time and resources.

      5: What does the description say about the incidence of distinct objects? Is there any particular relation described?

      6: What's special about the collections?

      7: Ok, but how might that be possible? what is it about Zipf's law that makes this possible?

    5. Zipf's law: Zipf's law states that the relative probability of a request for the i'th most popular page is inversely proportional to ‘i’. It specifies that popularity objects are ranked according to their popularity, then the probability that the user chooses the ‘m’th item on the list is 1/m.

      File server exploits Zipf's law by searching the request [8] based on least occurrence of words by removing highly ranked words. [9]

      8: Perhaps that phrase should have read “searching for the request”?

      9: If I'm understanding the answer correctly, it's suggesting that commonly occurring words occur in too many results to be useful and should be thrown away in favor of less commonly occurring words. Ok, but what does that have to do with Zipf's law? Zipf's law relates frequency and rank ordering in a particular way, and this answer doesn't exploit that relation.

    6. Zipf's law outlines how often individual objects in a set will occur. The frequency of an object in a set is inversely proportional to its overall frequency. [10] A file server could use Zipf's law to stack the cache with more frequently used files. [11] This act would increase the speed of transfers using this server.

      Research Source: HTTP://planetmath.org/ZipfsLaw.html, section: Zipf's law (explanation, formulae, graph).

      10: Isn't it more like the probability of occurrence is inversely proportional to the rank?

      11: Doesn't replacement cache do this anyway? How does Zipf's law help make this better?

      1. Zipf's law is the theory that the most common words in the English language start off with a high rate of occurrence which begins to drop as the words become less common. [12] This can work with other types of data as well.
      2. A file server could exploit Zipf's law by looking for the most common information [13] and patterns [14] by repeating how it handles the information [15] which will make the process faster.
      3. Exploiting Zipf's law would allow for smoother performance and move similar information faster through the system.
      Sources: Planet math — Zipfs law'

      12: That's all Zipf's law states? That the most common words have a high rate of occurrence? Isn't that tautological?

      13: What kind of information? Information (that is, queries) that comes from the clients, or information (that is files) that comes from the server? Or maybe both.

      14: What kind of patterns?

      15: What does repeating information handling involve?

    7. The most frequently occurring words are ranked in increasing order of their occurrence. So most frequently occurring word is ranked 1. [16]

      These most frequently occurring words are non-content words and the least frequently occurring words i.e. with high rank have more content in it. [17]

      The file server exploits Zipf's law to improve performance by “effective caching”. This property provides an important tool in designing architectures of web caching. Zipf's law helps in selecting which objects to cache. [18] In this case it uses Zipf's law to select content objects to cache to improve its performance. Popularity of a video file can be calculated using Zipf's law. [19]

      16: Is this all Zipf's law says?

      17: Is Zipf's law concerned with meaning (word content), or popularity?

      18: How does Zipf's law help? How is it used to select objects for caching?

      19: Is that what Zipf's law is concerned about? Isn't popularity determined by how many times an object is used?

  4. A video file contains 640 × 420 pixel frames with 16 bits of color per pixel and a 50 frames/sec display rate. How much bandwidth is required to stream this file?


    A 640 × 420 frame has 268,800 pixels; at 16 bits of color per pixel, a frame has 4,300,800 bits. 50 frames per second is equivalent to 215,040,000 bits per second. Unencoded video transmission requires a little under a quarter gigabit per second.
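    The same arithmetic as a few lines of Python, in case it helps to see the units multiplied through. The function name `raw_video_bitrate` is made up for this sketch, and it assumes no compression and no protocol overhead.

```python
def raw_video_bitrate(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Bits per second needed to stream uncompressed frames:
    (pixels/frame) * (bits/pixel) * (frames/second)."""
    return width * height * bits_per_pixel * fps

bps = raw_video_bitrate(640, 420, 16, 50)
print(bps)        # 215040000
print(bps / 1e6)  # 215.04 (megabits per second, decimal prefix)
```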

    The answers, approximately verbatim, and in no particular order:

    1. The formula of measuring the required bandwidth is:

      Required Bandwidth = Frames/sec × Resolution × Color Depth [1]

      Required Bandwidth = 50 × 640 × 420 × 16 = 215,040,000 Bits/second = 205.07 Megabits/second.

      1: Do the units carry through in this equation?

    2. Pixels/frame = 640*420 = 268800
      Bits/frame = 268800*16 = 4300800
      Bandwidth = 4300800*50 = 215040000 = 215Mbps
      Reference: Text (A Systems approach)
    3. Given

      a video file contains 640*420 pixel frames = 268800 pixels

      Bits of color per pixel = 16

      display rate = 50 frames/sec.

      Now,
      The data in one frame = no. of pixels * no. of bits/pixel
      = 268800*16
      = 4300800 bits
      Hence,
      The no of bits displayed in one second = display rate * data in one frame
      = 50*4300800
      = 215040000 bits = 215 Mbps
      So 215 Mbps bandwidth is required to stream this video file.
    4. A video file with 640 × 420 pixel frames would normally have a size of 268,800 bytes; however, since the file uses 16 bits of color per pixel, this number is doubled (8 bits per byte normally) to 537,600 bytes. The file for this problem requires a frame rate of 50 frames per second, so the bandwidth that is needed to stream this video is 537,600 bytes multiplied by 50 frames/second, yielding a final answer of 26,800,000 bytes/sec.

      Source: HTTP://www.pk3.org/Astro/index.htm?astrophoto_vesta_pro.htm
      (Example) The reason of above facts is limited throughput of USB. Vesta Pro is using YUV420 codec, which requires 12 bits per pixel. That means, that, for example, 640x480 pixels frame has size 460800 bytes. For 5 fps video stream it requires 2304000 bytes/s - it is more than throughput of USB. That's why there must be used some compression of video data, which are sent through USB. As my measurements confirmed, the compression is lossy. [2]

      2: Really, don't do any quoting; just the citation will be enough.

        • 420 × 16 = 6720 bits per frame [3]
            640/50 = 12.8 seconds of video
              6720 × 50 = 336000 Kilobit per second
                It would take 322.560 Kilobytes, or 0.315 Mbytes
                  Sources: astream.com — streaming bandwidth calculator

                  3: Ok, this was my mistake.

                1. 15KB [4] × 10 bits/byte [5] = 150 ; 150 × 50 frames/sec = 7500 Kbits/sec

                  http://www.imakenews.com/kin2/e_article000345313.cfm?x=b11,0,2

                  4: From where did 15KB come? What does it represent?

                  5: What does 10 bits/byte represent?

                2. [6]
                  Bandwidth * (16 bits) * (50 frames/sec)
                  215 040 000 bits
                  The required bandwidth to stream this file is 215 040 000 bits.

                  6: Do the units carry through to bits?

                3. Number of bits per pixel = 16.

                  Total no. of pixels is 640 × 420 = 268,800.

                  Data in one frame is 268,800 × 16 = 4,300,800 bits.

                  Given display rate is 50 frames/sec.

                  So in one second no of bits displayed = 50 × 4,300,800 = 215,040,000 bits = 215 mbits.

                  So clearly bandwidth required to stream this video is 215 mbps.
              5. FTP does not keep the port 20 data connection open for multiple file transfers. What is the justification for breaking the data connection after every file transfer?


                Some transfer modes used by FTP require that the server close the data connection to indicate end-of-file.

                Reference: Transmission Modes (section 3.4) from File Transfer Protocol (rfc 959).
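                To see why closing the connection works as an end-of-file marker, here is a minimal Python sketch. It uses `socket.socketpair()` as a stand-in for the data connection, and the helper name `receive_until_close` is made up for illustration; it is not how any particular FTP implementation is written.

```python
import socket

def receive_until_close(conn: socket.socket) -> bytes:
    """Read a stream with no length prefix: the data ends when the
    peer closes its end of the connection."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # an empty read means the sender closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)

sender, receiver = socket.socketpair()  # stand-in for the data connection
sender.sendall(b"file contents")
sender.close()                          # closing is the only end-of-file signal
data = receive_until_close(receiver)
receiver.close()
print(data)  # b'file contents'
```

                Because the byte stream carries no length or end marker, the receiver cannot tell where one file stops and the next begins unless the connection is closed between them.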

                The answers, approximately verbatim, and in no particular order:

                1. Closing port 20 after every file transfer is an indication that the current file or data which is being sent was completely transferred. For that, server or receiver can know that the amount of data was reached to its destination.
                2. [picture]

                  • In step 1 the client's command port (1026 in this case) sends a request to the server's command port, i.e. port 21.
                  • It sends an acknowledgment in step 2 to port 1026.
                  • In step 3 the server initiates the data connection at data port.
                  • Finally the client sends an acknowledgment back.1
                  Since FTP uses TCP, it has to send an acknowledgment back after each file transfer. So every time a file transfer is made, steps 3 and 4 are repeated, i.e. after each file transfer the data connection is broken2 and a new connection is established.3

                  1: Is this an acknowledgement in response to the server making the data connection, or to something else?

                  2: But why is the connection broken?

                  3: Even if there is no other data being transferred?

                3. The FTP does not keep the port 20 data connection open for multiple file transfers because it breaks the data connection after every file transmission.

                  The FTP client initiates a connection4 with the FTP server on its port 21.

                  Port 21 is where the server is listening for commands issued to it, and in turn, which it will respond to. Hence here the TCP/IP handshake is complete.5

                  At this point, the client begins to listen on its ephemeral port + 1, and sends the PORT N + 1 command to the server on its port 21, i.e., if the ephemeral port in use by the client is 1026, then it would listen on port 1027.

                  Once this is done, the data transfer port (port 20) on the FTP server initiates a connection to the FTP client's ephemeral port plus 1, as indicated above. This is how an active FTP session is conducted by both the client and server.

                  Hence the port 20 data connection for multiple file transfers is not open because it breaks for every single connection after every file is transferred.6

                  4: Which connection? There are two of them: command and data.

                  5: Is this something that FTP does, or does TCP do the handshake?

                  6: That's true, but why does FTP break the connection? That's what the question's asking?
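
For reference, the PORT command described in the answer above does not carry a bare port number. RFC 959 defines its argument as six comma-separated decimal fields: the four bytes of the host address followed by the high and low bytes of the port. A minimal Python sketch of that encoding follows; the helper names and the address 192.168.1.2 are hypothetical, chosen only for illustration.

```python
def port_argument(ip_address, port):
    """Encode an IPv4 address and port as h1,h2,h3,h4,p1,p2."""
    octets = ip_address.split(".")
    high, low = divmod(port, 256)      # port = p1 * 256 + p2
    return ",".join(octets + [str(high), str(low)])


def parse_port_argument(argument):
    """Decode h1,h2,h3,h4,p1,p2 back into (ip_address, port)."""
    fields = [int(field) for field in argument.split(",")]
    ip_address = ".".join(str(field) for field in fields[:4])
    return ip_address, fields[4] * 256 + fields[5]


# Hypothetical client address; port 1027 is the example from the answer.
command = "PORT " + port_argument("192.168.1.2", 1027)
print(command)
```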

                4. Breaking the data connection after every transfer is necessary to avoid confusion between different connections.7

                  Source: Notes from Networking and Internet Technologies class at Rutgers.

                  7: Which different connections? Between the control and data connections? How many other connections are there?

                5. An FTP closes the port 20 data connection after each file transfer because it avoids confusion between the data connections. It's possible that old data from previous transfers could still be present, and reestablishing a connection each time helps avoid one transfer absorbing the old data from a previous one. It is also possible that a system could “lock up” because it is waiting for an end-of-transfer message8 that it might not receive due to “time outs” on the firewall's side.9 It is safer and more accurate for each transfer to have freshly opened ports.

                  Source: HTTP://msmvps.com/blogs/alunj/archive/2009/07/13/1300796.aspx
                  In typical Stream Mode operation, a new data connection is opened and closed for each data transfer, whether that’s an upload, a download, or a directory listing. To avoid confusion between different data connections, and as a recognition of the fact that networks may have old packets shuttling around for some time, these connections need to be distinguishable from one another.
                  Source: HTTP://www.ncftp.com/ncftpd/doc/misc/ftp_and_firewalls.html
                  Even if the client program is planning on ending the session, the FTP requires that the client program send a message ("QUIT") to the server indicating that the connection should be closed, and the server is then required to reply with another message indicating that the session is officially closed. The ramifications are that the client program could then lock up waiting for a reply to a "QUIT" message that the server will not receive since the firewall timed-out the session, unbeknownst to both client and server. The solution for this specific case, which some, but not all, FTP client programs do, is to either place a very short time-out on the reply to the "QUIT" message, or to simply close its end of the FTP session (which violates the FTP protocol, but is de facto behavior and is generally accepted).

                  8: From where does the end-of-transfer message come?

                  9: Firewall? Why is there a firewall involved?

                6. The justification for the FTP breaking the data connection after every file transfer is that FTP uses port 21 for listening to commands, while port 20 is for receiving data files. The connection is broken so that the FTP can receive the commands about the data10 before it receives the data, so there is no need to keep it open.

                  Sources: windosnetworking.com — understanding ftp protocol

                  10: What does breaking the data connection have to do with receiving commands? Aren't they separate connections?

                7. ## There are two FTP modes, and when FTP uses either active or passive mode, there will be a justification for breaking the data connection after every file transfer.

                  ## Active FTP mode is an inefficient way to deal with a multiuser system because if many users made a lot of FTP requests, the system wouldn't be able to match all the incoming FTP data connections to the right users. Consequently, FTP does not keep the port 20 data connection open for multiple file transfers, to make sure that incoming FTP data connections are matched correctly.11 Also, firewalls might block connections in this mode.12

                  ## Passive FTP mode is used to solve the firewall issues that might block some connections. In this mode, the client initiates the data connection from its data port to the specified server data port to avoid blocked connections; as a result, the client opens a new connection port for each connection or file transfer, so the server cannot deal with all incoming connections. Consequently, FTP does not keep the port 20 data connection open for multiple file transfers, to make sure that incoming FTP data connections are matched correctly.

                  11: But doesn't the command connection have the same problem? And FTP doesn't close the command connection after each command.

                  12: If a firewall blocks connections, how can connections be broken?

                8. An FTP does not keep the port 20 data connection open for multiple file transfers because, when the server has finished sending data in a transfer mode, it needs to close the connection to indicate the end of file, since the server is done transferring the requested data.
                  Reference: rfc.13

                  13: Which rfc? There are over 6,000 of them.


      This page last modified on 2012 October 1.