In the seventh week I continued writing unit tests for my team configuration generator. The unit tests now cover a large part of the functionality of the user and team configuration generators.
At my mentor’s suggestion I started learning about the protocol that Teamshare is going to use for data transfers, Peer-to-Peer Streaming Peer Protocol (PPSPP). I will briefly introduce the protocol in the remainder of the post.
PPSPP is a protocol for disseminating the same content to a group of interested parties in a streaming fashion. The protocol supports both pre-recorded and live data transfer. In contrast to other peer-to-peer protocols, it has been designed to provide shorter time-till-playback, and to prevent disruption of the streams by malicious peers. In my opinion, the most interesting parts of PPSPP are the chunk addressing schemes and the content integrity protection.
For chunk addressing, PPSPP offers two schemes: start-end ranges and bin numbers. As the name suggests, a start-end range identifies chunks by specifying the first and the last chunk of the range. Bin numbers are a novel method of addressing chunks in which a binary interval of data (a run of 2^k chunks aligned on a 2^k boundary) is addressed by a single integer. This reduces the amount of data that every peer has to record.
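To make bin numbers more concrete, here is a minimal Python sketch of how I understand the numbering in the draft: the chunks are the leaves of a binary tree, leaf i gets bin number 2*i, and a node at layer k with offset o gets bin number 2^(k+1)*o + 2^k - 1. The function names are my own and are not part of PPSPP or Teamshare.

```python
def bin_to_chunk_range(bin_number):
    """Return (first_chunk, last_chunk) covered by a bin number."""
    if bin_number < 0:
        raise ValueError("bin numbers are non-negative")
    # The layer of a bin equals the number of trailing 1 bits in its number.
    layer = 0
    while (bin_number >> layer) & 1:
        layer += 1
    # The remaining high bits (above the 0 that ends the run) give the offset.
    offset = bin_number >> (layer + 1)
    first = offset << layer
    last = first + (1 << layer) - 1
    return first, last


def layer_offset_to_bin(layer, offset):
    """Inverse mapping: bin number of the node at the given layer and offset."""
    return (offset << (layer + 1)) + (1 << layer) - 1


if __name__ == "__main__":
    for b in (0, 1, 2, 7, 11):
        print(b, bin_to_chunk_range(b))
```

With this numbering, a peer that holds chunks 4 to 7 can advertise the single integer 11 instead of listing four chunks.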
For content integrity protection, PPSPP uses the Merkle Hash Tree scheme for static transfers, and for live transfers a Unified Merkle Hash Tree scheme, which adds a public key for verification. The content is identified by a single cryptographic hash, the root hash of a Merkle hash tree, calculated recursively from the content. In contrast with BitTorrent, which needs all the chunk hashes before it can start the download, PPSPP needs only a part of them, which keeps the overhead limited, especially for small chunks.
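The idea behind the root hash can be sketched in a few lines of Python. This is only an illustration under my own assumptions (SHA-1 as the hash function, all-zero hashes as padding up to a power of two, hypothetical function names); the draft defines the exact rules. The second function shows why a peer needs only a part of the hashes: to check one chunk it is enough to know the sibling hashes on the path from that chunk to the root.

```python
import hashlib


def _sha1(data):
    return hashlib.sha1(data).digest()


def merkle_root(chunks):
    """Compute the root hash of a Merkle hash tree over a list of byte chunks."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [_sha1(chunk) for chunk in chunks]
    # Assumption: pad the leaf level with all-zero hashes to a power of two.
    width = 1
    while width < len(level):
        width *= 2
    level += [b"\x00" * hashlib.sha1().digest_size] * (width - len(level))
    # Hash pairs of nodes until only the root is left.
    while len(level) > 1:
        level = [_sha1(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def verify_chunk(chunk, index, sibling_hashes, root):
    """Check one chunk against the root using only the sibling hashes
    on its path, ordered from the leaf level upwards."""
    current = _sha1(chunk)
    for sibling in sibling_hashes:
        if index % 2 == 0:
            current = _sha1(current + sibling)  # we are the left child
        else:
            current = _sha1(sibling + current)  # we are the right child
        index //= 2
    return current == root
```

For a tree with n leaves, verifying a chunk this way takes about log2(n) hashes instead of all n.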
For more details, feel free to read the IETF draft at this webpage http://tools.ietf.org/html/draft-ietf-ppsp-peer-protocol-07.
In my fifth week, I had to deal with some bugs in the filesystem event simulator that I have been working on. One bug in particular gave me a tough time because it was hard to detect. It was rare, occurring only after two copy operations followed by two deletions of the same copied file or directory from the destination directory. Because of a faulty check during the copy process, the system overwrote the first copy with the second, while my list of the destination directory's contents still recorded two different files or directories. As a result, the second delete operation could choose a file or directory that no longer existed.
I mentioned in my previous post that I used the cp, mv and rm commands for these operations. I believed that they worked on Windows because I had tested them on my own Windows machine, but after testing on another Windows machine I noticed that they don't. I solved this bug by using the corresponding Windows commands.
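As an illustration of the platform switch, here is a small sketch of how the command could be chosen. It is not the exact code from the simulator: the function name and parameters are made up for this example, and the Windows commands shown (copy, xcopy, move, del, rmdir, run through cmd /c) are one possible set of equivalents.

```python
import os


def build_command(operation, source, destination=None, is_dir=False, windows=None):
    """Return the argument list for a copy, move or remove command.

    'windows' defaults to the current platform; pass it explicitly for testing.
    """
    if windows is None:
        windows = os.name == "nt"
    if operation == "copy":
        if not windows:
            return ["cp", "-r", source, destination]
        if is_dir:
            # /E copies subdirectories, /I treats the destination as a directory.
            return ["cmd", "/c", "xcopy", source, destination, "/E", "/I"]
        return ["cmd", "/c", "copy", source, destination]
    if operation == "move":
        if not windows:
            return ["mv", source, destination]
        return ["cmd", "/c", "move", source, destination]
    if operation == "remove":
        if not windows:
            return ["rm", "-r", source]
        if is_dir:
            # /S removes the whole tree, /Q suppresses the confirmation prompt.
            return ["cmd", "/c", "rmdir", "/S", "/Q", source]
        return ["cmd", "/c", "del", source]
    raise ValueError("unknown operation: %s" % operation)
```

The returned list can be passed to subprocess.call, which avoids building a shell string by hand.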
After solving all the bugs I started working with Java by writing small programs to better understand the language. I also continued reading the existing Teamshare code and began unit testing with PyUnit, the Python unit testing framework (the unittest module). So far I have tested the user configuration file generator that I worked on in the first weeks.
I will continue unit testing with the team configuration file generator and the filesystem event simulator.
During the third week I fixed a series of bugs in the user and team configuration file generators and started to understand the code behind Teamshare. This process is slow because the code is mainly written in Java, which is a new language to me, and because it also requires basic knowledge of XML and Maven. In spite of this, I am starting to understand how the code works and I feel I have learned quite a bit.
In the fourth week, besides continuing to study the code, I started work on a script that simulates filesystem events. These events consist of creating, copying, deleting, moving, removing or renaming files and directories within a given directory. After solving a few problems with the copy event, all the other events were easy to implement and fix.
While using the Python library shutil in the event simulator to perform the copy, move and remove actions, I encountered a rather annoying problem. After implementing the copy functions, I noticed that the program crashed every time it should have copied a file or a folder. After inspecting the problem, I found out that the copytree function from shutil requires that the destination folder not already exist. After looking for a solution, I resorted to using os.system to run external commands (cp, mv, rm).
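For reference, the copytree restriction can also be worked around without external commands. The sketch below is not what I used in the simulator; it assumes Python 3.8 or newer, where copytree accepts the dirs_exist_ok parameter, and the helper name copy_into is made up for this example.

```python
import os
import shutil


def copy_into(source, destination_dir):
    """Copy a file or a directory tree into destination_dir and return the new path.

    An already existing copy at the target is updated instead of raising an error.
    """
    name = os.path.basename(os.path.normpath(source))
    target = os.path.join(destination_dir, name)
    if os.path.isdir(source):
        # Python 3.8+: dirs_exist_ok=True allows the target to exist already.
        shutil.copytree(source, target, dirs_exist_ok=True)
    else:
        os.makedirs(destination_dir, exist_ok=True)
        shutil.copy2(source, target)
    return target
```

On older Python versions, a similar effect can be obtained by creating the target directory first and copying its entries one by one.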
In the following weeks I will perform unit testing on the work I have done and review my event simulator for bugs and improvements.
My name is Victor Ciurel and I am working this summer on the Teamshare project, under the supervision and guidance of my mentor, Adriana Draghici. Teamshare is responsible for distributed file management in Teamwork, which is an easy-to-use, portable system for team management.
My goal in this project is to implement a testing and benchmarking service for the decentralized file sharing system.
In my first week, I learned about the history and development of Teamshare and Teamwork. I began working on the project in a surprising way, by solving incompatibilities between the tools and technologies used and the operating system on my laptop. I read the documentation to better understand the design and conventions used by Teamshare. After reading up on JSON and its implementation in Python, I started working on my first Python script, which generates random user data and writes it to a JSON file.
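To give an idea of what such a script looks like, here is a minimal sketch. The field names (id, username, email) and the function names are invented for this example and do not reflect the actual Teamshare user format.

```python
import json
import random
import string


def random_user(rng, user_id):
    """Build one user record with hypothetical fields."""
    name = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return {"id": user_id, "username": name, "email": name + "@example.com"}


def generate_users(count, seed=None):
    """Generate 'count' random users; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return [random_user(rng, user_id) for user_id in range(count)]


def write_users(path, users):
    """Write the users to a JSON file."""
    with open(path, "w") as handle:
        json.dump({"users": users}, handle, indent=2)


if __name__ == "__main__":
    write_users("users.json", generate_users(5, seed=42))
```

Passing a seed is useful for testing, because the same seed always produces the same users.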
In my second week, I finished my script for generating random user data after many modifications. I continued reading up on the technologies needed for this project and started working on a script for generating random team data, in which I am still fixing small problems.
Adriana and I have decided on a short-term workflow for me. After finishing the random team data generator, I will skim through the existing Teamshare code and then work on a Python script that modifies the user and team data files to simulate real modifications.
Although I have some catching up to do on my work, I feel very motivated and eager to learn more and to work on my tasks.