Private:progress-bu-khamsin
2012
April
After spending hours trying to read and understand the SDP implementation for Infiniband, I realized that I need to read in depth about BSD sockets, the Linux implementation of INET sockets, and the data structures they use before I start modifying the code. I am now reading http://www.cs.unh.edu/cnrg/people/gherrin/linux-net.eps and http://www.cookinglinux.org/pub/netdev_docs/net.pdf and chapters from "Linux TCP/IP Networking for Embedded Systems" by Thomas F. Herbert.
The way PCIe establishes a connection with the other side is different from the regular socket connection process. Unlike traditional networks, each machine connected to the PCIe fabric detects every other machine as a unique device. So for machine A to connect to machine B, machine A has to open machine B's device and write to the BAR register to indicate that it wants to connect. At the same time, machine B has to be watching the BAR register associated with A to be able to respond to the request. So, to resemble the socket connection process, once a socket is created and listen() is called, all PCIe devices seen by the machine should be configured to listen, because there is no way to tell from which device the connection will come. On the client side, since we know which machine we are connecting to, only the device associated with that machine needs to be configured when connect() is called.
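The listen()/connect() asymmetry described above can be sketched in plain user-space C. This is only a simulation under stated assumptions: the doorbell values, the struct layouts, and the sim_* function names are all made up for illustration, and an ordinary variable stands in for the BAR-mapped register. None of it comes from the PLX SDK or the SDP code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values written into the simulated BAR register. */
enum { DB_IDLE = 0, DB_CONNECT_REQ = 1, DB_CONNECT_ACK = 2 };

#define MAX_PEERS 4

/* One entry per remote machine visible on the fabric (simulated). */
struct peer_dev {
    volatile uint32_t doorbell; /* stands in for the BAR register */
    bool listening;             /* set once sim_listen() has armed it */
};

/* The set of peer devices as seen by one machine. */
struct fabric {
    struct peer_dev peers[MAX_PEERS];
    size_t npeers;
};

/* Server side: listen() has to arm every peer device, because the
 * request may arrive from any of them. */
void sim_listen(struct fabric *f)
{
    assert(f->npeers <= MAX_PEERS);
    for (size_t i = 0; i < f->npeers; i++) {
        f->peers[i].doorbell = DB_IDLE;
        f->peers[i].listening = true;
    }
}

/* Client side: connect() touches only the register that the chosen
 * remote machine watches for this client. Returns 0 on success and
 * -1 if the slot does not exist. */
int sim_connect(struct fabric *remote, size_t slot)
{
    if (slot >= remote->npeers)
        return -1;
    remote->peers[slot].doorbell = DB_CONNECT_REQ;
    return 0;
}

/* Server side: scan the armed registers once. Acknowledges and returns
 * the slot of the first peer that requested a connection, or -1. */
int sim_accept(struct fabric *f)
{
    for (size_t i = 0; i < f->npeers; i++) {
        struct peer_dev *p = &f->peers[i];
        if (p->listening && p->doorbell == DB_CONNECT_REQ) {
            p->doorbell = DB_CONNECT_ACK;
            return (int)i;
        }
    }
    return -1;
}
```

In this sketch the server calls sim_listen() once and then polls with sim_accept(), while a client calls sim_connect() with its own slot. A real implementation would presumably replace the polling loop with whatever notification mechanism the hardware offers.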
March
The cards have arrived and are installed on two Dell machines. It took me two days to install and configure them. The cards were preloaded with a configuration not suitable for machine-to-machine communication. One Stop Systems did not ship the drivers with the cards. After contacting their support, it turned out that the drivers are still under development, so the only available drivers are those provided by PLX (the chip maker). So I contacted PLX support, and they agreed to give me access to the NDA-protected configuration document required to program the cards to work as a machine-to-machine interconnect. I then programmed the EEPROM of the cards and ran the sample applications that come with the driver. The initial throughput I got is 1.2 GB/s, although the theoretical throughput is 2.5 GB/s. Testing the DMA performance of each machine shows that one of them is two times faster than the other. I moved the card to another lane and started getting much better speed, but the other machine is still slightly faster. This makes me believe that, by changing the configuration, the cards can achieve up to 2 GB/s as reported by this paper http://sigops.org/sosp/sosp11/workshops/hotpower/03-byrne.pdf.
After I ran the sample applications, I got a clearer idea about the driver. I have finished building the API and tested it with a test kernel module I built along with the API. The test module works very well and proved that the API does its expected job.
February
The PLX_API in user space interacts with the driver by opening the driver and sending ioctl requests. To provide similar functionality to kernel space, I have written a function similar to the ioctl handler and exported it to the kernel. This function requires an instance of the device_object structure as one of its parameters. To make this structure available in kernel space, I have written and exported another function that gets this structure from the driver and makes it available to the kernel.
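The two exported functions described above can be pictured with a small user-space mock. It is a sketch under assumptions, not the actual driver code: the fields of device_object, the command codes, and the names drv_get_device_object and drv_dispatch are all invented here for illustration. In a real module both functions would be made visible to other modules with EXPORT_SYMBOL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 8

/* Hypothetical stand-in for the driver's per-device state; the real
 * device_object structure has different contents. */
struct device_object {
    uint32_t regs[NUM_REGS];
};

/* Hypothetical command codes, playing the role of ioctl request numbers. */
enum drv_cmd { CMD_REG_READ, CMD_REG_WRITE };

/* Hypothetical argument block passed along with a command. */
struct cmd_args {
    uint32_t index; /* register index */
    uint32_t value; /* value to write, or value read back */
};

/* Device state owned by the "driver". */
static struct device_object the_device;

/* Accessor that hands the device state to in-kernel callers. */
struct device_object *drv_get_device_object(void)
{
    return &the_device;
}

/* Command dispatcher mirroring what the ioctl handler does, but callable
 * directly without a file descriptor. Returns 0 on success and -1 on a
 * missing device, an out-of-range register, or an unknown command. */
int drv_dispatch(struct device_object *dev, enum drv_cmd cmd,
                 struct cmd_args *args)
{
    assert(args != NULL); /* a NULL argument block is a caller bug */
    if (dev == NULL || args->index >= NUM_REGS)
        return -1;

    switch (cmd) {
    case CMD_REG_READ:
        args->value = dev->regs[args->index];
        return 0;
    case CMD_REG_WRITE:
        dev->regs[args->index] = args->value;
        return 0;
    }
    return -1;
}
```

A kernel-side caller would first obtain the structure with drv_get_device_object() and then pass it to drv_dispatch() together with a command, which is the same sequence a user-space program performs through open() and ioctl().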
After finishing the API, I started working on a test module that uses the API to transfer data between two machines. However, I am facing some problems with mapping the hardware registers and memory to user space. As the module runs purely in kernel space, I couldn't find a way to allocate a user space address to be used for the mapping. And even if I managed to do so, all the kernel calls I found require a file descriptor as a parameter, which can only be obtained when the driver is opened from user space. I looked for a solution to this problem and found some posts that advise referring to the Infiniband driver. So that is what I am doing right now. I have studied the driver before but didn't go in depth into the memory management part.
January 30
I found out that the PLX API available in the SDK is only accessible to applications written for user space and cannot be used by kernel space modules. So I am now working on writing similar functions for kernel space, which requires modifying the PLX drivers.
January 27
I find it hard to replace RDMA calls with DMA calls, as each API is written at a different level of abstraction. So I find myself in need of a deeper understanding of memory organization and how it is divided. I also need to know which parts of memory are accessible via DMA, the kind of addressing required, and how to translate virtual addresses to physical and to PCI addresses. Having finished reading a few chapters from the books below, I am now going through the code again, trying to apply what I have learned. I also find that the documentation of the RDMA and DMA APIs is not very detailed and doesn't describe the effect of calling the functions, so I need either to go through the implementation files of the APIs or to play with the sample applications after I receive the hardware.
January 26
- Reading chapters from "Linux Device Drivers", Third Edition, by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman
January 25
- Reading chapters from "Understanding the Linux® Virtual Memory Manager".
January 22
- Successfully isolated and compiled the Infiniband SDP driver:
Finally, I was able to compile the driver without modifying the kernel headers. I installed CentOS 5.7, then installed and compiled kernel 2.6.18. I then applied some patches, included with the source code of the driver, that match this version of the kernel. After that I moved all the dependencies to the driver directory and modified some Makefiles and some includes in the header files of the driver.
- Studying PLX NTB DMA Drivers
- Reading about DMA programming
January 3
- Study the differences between DMA and RDMA
- Discover how Infiniband SDP is exchanging local memory addresses
January 1
- Trying to prepare the development environment and find the right version of kernel:
Before I start modifying the code of the Infiniband SDP implementation, I need to find a way to build the source code smoothly. Building kernel modules under the 2.6 Linux kernel requires downloading and compiling the kernel first. I have tried different Linux distributions and installed different kernels, but the SDP code keeps giving me errors at compilation time. I managed to compile the driver after I modified a few headers in the kernel and in the driver code, but I don't think this is the right way. I am still trying.
- Identify driver dependencies and find PCIe equivalent.
Fall 2011
- Courses:
- CMPT 880 Programming Parallel and Distributed Systems
December 27
- Isolating the relevant code from the OFED Infiniband software stack.
- The socket switch module can be used with no change
December 26
- Reading chapters from "Understanding Linux Network Internals"
- Going through the source code of SDP implementation for Infiniband
December 12
- Exploring PLX SDK
- Studying Infiniband implementation of SDP and how to bypass TCP while using standard socket API
- Researching how to create devices in Linux
- Working on course project
November 30
- Collected all the information I need about NTB and SDP.
- Registered on the PLX website to download and experiment with their SDK and development kit.
November 14
- Learning how to write kernel modules
- Learning about PCIe protocol layers
- Researching the different types of PCIe bridges and adapters.
September 19
- Here is my progress report [1]
September 12
- Working on my research progress report
Summer 2011
- Courses:
- CMPT 777 Formal Verification
June 17
Studying for the final
June 6
No update
May 8
Besides my course work, I am planning to continue working on my project. Check the latest report. [2]
Spring 2011
- Courses:
- CMPT 771 Internet Arch and Protocols
- CMPT 886 Special Topics Operating Syst
April 8
Plotted the initial results of an experiment that support the claim that a process's network traffic is correlated with its execution time, which makes network traffic a good metric for performance degradation due to interconnect contention. What is left is to run the experiment two more times to confirm the results.
Mar 14
- Work on Course Projects.
- Experiment with a tool that measures network traffic per process on HPC systems, to use it in my project.
Mar 7
- Work on Course Projects.
- Read about power consumption effects on multi-core system performance.
Mar 1
- Work on Course Projects.
- Prepare description for my new research project.
Jan 21
- I have been working on the "Top Ten Computationally-Complex Problems in Oil and Gas Exploration Field" survey and trying to enrich my overall knowledge of the subject. I am focusing right now on applications based on seismic data and have written an introduction about them in the report [3]
- I have also found a very interesting book, "Soft Computing and Intelligent Data Analysis in Oil Exploration" by M. Nikravesh, L.A. Zadeh, and Fred Aminzadeh [4]. It is mainly about solving petroleum engineering problems using artificial intelligence techniques, which I think can lead me to an interesting research topic.
Fall 2010
- Courses:
- CMPT 705 Design/Analysis Algorithms
- CMPT 741 Data Mining