Created: 2014-05-18 21:30
Updated: 2014-05-18 22:36


Reliable Transmission and Routing

CSEE4119 Computer Networks, Spring 2014

Programming Assignment #2

Assignment Description: http://www.columbia.edu/~sy2515/2014CSEE4119-ProgrammingAssignment2.pdf

Shuo Yang, sy2515

As this README is not organized in order, so if you encounter something which you can’t fully understand, please refer the following parts in which a detailed discussion for it should be appearing.

a. A brief description of your code

  • high-level structure

In this program, I have three main threads to deal with the outgoing and incoming UDP message, and updating the DV table.

  1. The first one is "listener", which is responsible for listening for the incoming UDP message (four types: operation related, ROUTE UPDATE related, TRANSFER related, and flooding related which will be discussed in details later on). If this message is a ROUTE UPDATE message, then it will be sent to pool_in, which is the pool of incoming ROUTE UPDATE message. The pool is protected by a locker, which means that only one thread can access to it at one time.

  2. The second one is "sender", which will send ROUTE UPDATE message and COMMAND message ("LINKDOWN" and "LINKUP"). It gets message from pool_out, which is the pool of outgoing messages. of course, it is also protected by a thread locker.

  3. The third thread is the "DVupdate", which is responsible for getting incoming ROUTE UPDATE message from pool_in, performing DV updating, and sending the changed DV to pool_out.

  • DV updating algorithm in dynamic environment

There are basically three cases.

  1. The normal bellman ford algorithm with poison reverse. This is very straightforward. When one client get a DV from one of its neighbors, it will use greedy method to update its distance vector, and if there are changes in its DV, it will then send its DV to all of its neighbors.

  2. The “LINKDOWN” message dissemination. When one link is broken by the related clients, the two clients at the two sides of that link should first know what happened, and update its distance vector. Then, for other clients in the network, they should detect the fake distance to some destinations (meaning the present distance is longer/Inf than its previous distance via exact the same next hop; this phenomenon is brought by the broken link). For the “LINKUP”, its no more than the normal bellman ford algorithm. For this two commands, what we should be careful about is changing the “online mark” when a link is linked down or linked up.

  3. The “CLOSE” command. This command is a little different than the normal “LINKDOWN”, because I find that will not make the state table converge in some non-trivial networks. There are potential errors, because the neighbors of that closed client will sense this at different times, and update DV asynchronously. There will be some strange loops in the DV updating process, which I encountered with probability of about 20%. So I decide to change the behavior of this command, making the whole network more stable. As we will use 3*TIMEOUT to sense the leaving behavior of a client, so after 3*TIMEOUT, we can definitely conclude that the neighbor on the other side of the link has left. So we should change the distance to that neighbor as “Inf”, and disseminate this message to other clients in the network also. I know there is a possibility that when a neighbor doesn’t response in 3*TIMEOUT, maybe it is the link which has some problems, other than that the neighbor has left. But in our project, we don’t have this case. When a link is down, it is operated by the client connected with it, other than naturally down. As a whole, what we should disseminate in the network is that “that neighbor has left”.

  • transmission protocol

First of all, I use "HOST:PORT" as the ID for each client.

I use a self-defined protocol to transfer all the messages:

  1. operation based: initial as “@“; “LINKDOWN” and “LINKUP”

  2. ROUTE UPDATE based: initial as “#”; ROUTE UPDATE message

  3. TRANSFER based: initial as “$”; TRANSFER command, which is used to transfer the file chunk configured in the config file

  4. flooding based: initial as "%"(request) and "^"(reply); “GETGRAPH” and “GETFILE”, which will be discussed in details in part#e


~ operation message format:


origin: from where this command comes

command: "LINKDOWN" and "LINKUP"

parameter: the new weight in the "LINKUP" command; it’s empty for “LINKDOWN” message

~ ROUTE UPDATE message format:

"#SELF@neighbor1-weight1 neighbor2-weight2 ..."

SELF: the “HOST:PORT” ID for the neighbor from which this ROUTE UPDATE message comes

neighborx-weightx: its neighbors with the present shortest distance to them

~ TRANSFER message:


filename@seq: the name of the chunk (chunk1, chunk2 here), and its sequence number (1, 2 here)

origin@time: the client on which this "TRANSFER" command is invoked, meaning the origin of the file chunk, and the time

dest: the "TRANSFER" destination, to which we want to transfer the file chunks

viax@timex: the intermediate clients in the transmission process, and the time on which this packet arrived at that client; every time this packet go through a intermediate client, its name and time will be added to this header of the packet (length-changeable header)

content_of_filechunk: the content of the file chunk, read directly from file chunk

~ flooding message:

1). GETGRAPH and its reply:


-> when this function is called, send this message to all of its neighbors

-> for each neighbor: whenever it first receive a message, duplicate it and send it again on all neighboring links, except the one you received it from

message format:


SELF: the asker for the graph, on which this function is invoked

time: the time this function was invoked

previous_hop: from where this message directly comes from


SELF: the client which will send its graph to the asker

dest: the asker

neighbor-weight;neighbor-weight;...: the graph of mine (SELF)

2). GETFILE (check who has that file) and its reply:


SELF: the asker for the file, on which this function is invoked

time: the time this function was invoked

filename: the name of the requested file

previous_hop: from where this message directly comes from


SELF: the client which has this file

dest: the asker

filename: the name of the requested file

filesize: the size of the requested file

viax: the intermediate clients between me and the asker in the shortest path; adding a new one every time this message arrived at an intermediate client

We should notice that this two kind of message all use “time” attribute, because I want to use it as the flooding message sequence number. In practice, there may be many flooding messages at the same time arriving at one client. We should distingush them by a flooding sequence number. As the time (of course with other flooding information, such as message type, and the request target, say graph or file) is nearly identical here, so I add the to a part of the sequence number in the flooding.

3). GETFILE (to request the actual file) and its reply:


SELF: the asker for the file, on which this function is invoked

dest: the client on which this file is requested

filename: the name of the requested file

chunksize: the size of the chunk of the original file which that client should be responsible for transferring

total_clients: the total number of the clients on which this request is made

seq: the sequence number of this client


SELF: the client who has the file and ready to transfer its chunk to the requester

dest: the requester

filename: the name of the requested file

total_chunks: the number of chunks I will send back (as every time I will only send a small chunk of the portion which I'm responsible)

seq: the sequence number of the present chunk

content: the content of the present chunk

4). reliable transmission:

block-based acknowledge:

send a message (packet), then waiting for a specific time for the "ACK"; if timeout, reply; else, continue when receive a packet, reply "ACK" to that HOST with its sending PORT, then continue processing the packet

b. Details on development environment

Language: Python 2.7

OS: Mac OS X 10.9.2

editor: Sublime Text 2

c. Instructions on how to run your code

./bfclient.py config_file


python bfclient.py config_file

note: the config_file is the configuration file, naming “config.txt” or “config1.txt” or “config2.txt” depending which name you use in the local directory of this client.

d. Sample commands to invoke your code

  • “LINKDOWN 11111”: link down with

  • “LINKUP 11111 2.3:: link up with with a new weight 2.3

  • “SHOWRT”: show the current routing table for this client

  • “CLOSE”: let the present client leave (after 3*TIMEOUT, its neighbors will sense its leaving); if you want to join this client again, simply run the bfclient program again

  • “TRANSFER 11111”: transfer the file chunk configured in the config.txt to the destination “”; the present client will print the next hop in the path to the destination, and the dest node will print the file related information (name, seq#, size, the whole path during the transmission); each separately transferred file chunk will be saved locally at the receiver’s directory, with their previous file names; after the two chunks are separately transferred to the same dest client, they will be combined automatically, and the new file is called “output”

e. Description of an additional functionalities and how they should be executed/tested

First of all, as everything will work only after the table has converged, so I suggest you that you wait for a while for the table to be converged.

  • friendly notification

-> can't linkdown/up non-neighbors and yourself; can't linkdown/up linkdowned/upped neighbors

-> when linkdown/up with an client, notification should appear in the both sides

-> when a new client login, its neighbors (if online now) will notified as that client is linkup with me

-> error detection for the command type and the following parameters for one command; only permitting certain commands, and the parameters should follow the format as described

-> can't transfer file to an unreachable client (for the “TRANSFER” command)

-> can’t transfer file chunk which has not configured in its config file (for the “TRANSFER” command)

  • flooding based “GETGRAPH”

invoke: “GETGRAPH”

I use flooding to send a "getgraph" message to all the clients, and let all the clients send their neighbors with weights to the asker. Of course, the send-back message is also along the shortest path like the file transfer did, but we don't print each send-back message's path (like the file transfer did) because it's kind of trivial. Actually, originally I wanted to plot the graph based on all the client in the network and their neighbors with weights to them, but I find it’s kind of difficult and is not so relevant to our course, so I only print out each client and their neighbors with weights. I think these information can allow us to recover the structure of the whole structure of this network.

  • flooding based “GETFILE”

invoke: “GETFILE test.jpg”

I use flooding message to query a file in the network. First of all, I flood the request to all the clients in the network. If some one has that file, then a confirmation message will be sent back to me. After some path checks (which will be discussed later on), I will request the file from some of the clients who have that file, and I will send this request message to them. Of course, everyone of them will only be responsible for transferring part of the total file, which is more like P2P does. For example, if I have requested a 200KB file from two clients, each of them will responsible for transferring me only 100KB. And for each of them, as we don’t want to make the UDP packet too large, each time they will only send a 20KB chunk to me. This whole process is showed in details after you call the “GETFILE xxx”. You need only read all the information printed on the screen after you call that function to find and request a specific file.

Of course, you can't query no-existing files in the network. If you do, you will be notified “the targeting file does not exist.”

For the GETFILE command, I add another feature. If one node to the source should go via another node, but that node also have the targeting file, then we just drop the first node in the "fileget" message. This is very obvious. For example, if A ask file from B and C, and B and C all have that file, but the topological structure (meaning the shortest path from B and C to A) among them are A<->B<->C, then we don't need C to transfer the same file via B to A, because there will no gain in the total transmission speed and will bring network congestion (wasting the bandwidth between B and C) at the same time (the bottleneck is the bandwidth between A and B, but the link between B and C is also occupied during the transmission). So in this case we don't require file from C.

But if B's shortest path to A and C's shortest to A have overlapping, then we still require file from both of them. Because, for example:



This is the shortest path structure from B, C to A. In this case we still require both B and C for the targetting file if they all have that file. Because, if the bandwidth between A and D is larger than D-B and D-C, then we will get a higher transmission speed totally. Even if the bandwidth of A-D is smaller and is the bottleneck in the total structure, there will not be lost totally. So this mechanism is acceptable.

After we decide from which we ask for the file, we simply ask file of equal chunk size from all of them, because only under the shortest path structure we can't simply decide who should send me more/less. There is no need and meanings to do this because what we do in this project are only some concepts' implementation, but not actually things, and the practical things should be more complicated than what we can simulate in this project.

  • data concurrency

all kinds of lockers are used to perform data concurrency issues (lock_in for pool_in, lock_out for pool_out, lock_dv for the distance vector).

  • command history

invoke: “history”

command line history; it will list 10 most recent commands (if not yet up to 10, then all of them); use the "history.log" to record all the history commands in local directory

  • help

invoke: “help” or “help LINKUP”

command line help; all helps or specific command help; if the command entered can’t be recognized, you will be notified

  • brute quit

gracefully deal with Ctrl+C and errors in commands and their parameters

  • reliable file transfer (although limited)

As in the file transfer (GETFILE) command, there is possibility that the chunk will lost due to network congestion, so I would like to add the ACK mechanism to avoid this. This is a kind of reliable transfer relative to the original pure UDP protocol. As we can assume the transfer is basically reliable, I don't implement the checksum (which actually can be done based on the UDP protocol), because then we should use some trivial methods to test them. But in the GETFILE command, there is a practical demand to perform some reliable checks, because otherwise we will lose some packets in the transferring process, so I add the ACK to some kind of promising the reliable file transfer transmission. We can use "TEST_ACK" to test this, and check the final file with/without this reliable transmission mechanism. (the TEST_ACK flag is in the 12th line of the bfclient.py source code; when it is set to 1, we can get all the transferred file chunks; but when it is set to 0 meaning no reliable transmission is promised, there will be lost packets/chunks during the transmission process, and the final combined file will be smaller and incomplete, meaning low-resolution for the test.jpg)

So actually the "GETFILE" command provides us a convenient method to detect the reliable transmission mechanism.

f. Limitations

  1. prints are not ordered due to being called on different threads

  2. searches exact file name only

g. Other notes

  1. I always use the shortest path for transmission, both for small message and for file chunks

  2. for the newly login client, it is equal to linkup with all of its neighbors with the weight configured in the config.txt; so if there is a link between that client and one of its neighbors whose weight has been updated due to a “LINKUP” message, that old weight will be updated again with the new weight configured in config file of this client

  3. as we use the socket function to get the IP of this local machine (and bind the socket with that address), so in the testing process, for the config.txt please use the actual IP address other than ""

  4. please allocate port# in range of 0-65535, and exclude those already occupied ones

  5. as different clients will generate the “history.log” file with the same name, if you want to test, please create one sub-folder for each client. for example, for client1, create folder as “./client1”, etc.

Cookies help us deliver our services. By using our services, you agree to our use of cookies Learn more