Summer 2011 (RA)

Courses: None

Working on: Large-scale data processing with MapReduce on GPU/CPU hybrid systems

The report is available here or from this address: \students\neshat\Projects\Hadoop-GPU\documents\techReps\doc\doc.pdf

June 6

Looking into how to connect Hadoop to the GPU
Implementing a C++ version of the word count app for use with Hadoop Streaming
Starting to implement a CUDA version of the word count app and connecting it to Hadoop
Examining different Hadoop configurations to find the best match with CUDA/GPU

May 30

Fixing Hadoop on the NSL cluster. It is working again, and the master node is cs-nsl-c01.
Installing Hadoop on Windows in order to work with the CUDA SDK.

May 24

Working on the thesis revision based on received comments.
Re-installing Hadoop on the NSL cluster (our recent migrations caused some problems, and Hadoop stopped working. I re-installed it on cs-nsl-c02. The previous installation and files are still available on cs-nsl-c01).
Exploring Hadoop structures and configurations for different modes of operation.
Modifying and running the WordCount application on the cluster with part of the Reuters corpus as input (768 documents).

May 16

Explored the Hadoop scheduling algorithm.
Hadoop on the NSL cluster didn't work, so I decided to work on my own machine. In the meantime, I also worked on the NSL cluster to fix its problem with Hadoop.
Collected a data set from the Reuters corpus and ran the word count application bundled with the Hadoop package on it under different configurations.
Started to revise the thesis based on comments.

May 09

Added more information to the survey.
Started to explore Hadoop on a single node.

Spring 2011 (GF)

Courses: None
Submissions:
- Ranking sponsored online ads (NOSSDAV 11)

Working on: Large-scale data processing with MapReduce on GPU/CPU hybrid systems

The report is available here or from this address: \students\neshat\reports\large_scale_data_processing\doc\doc.pdf

May 02

Worked on a survey of large-scale data processing with MapReduce on GPU/CPU hybrid systems. The report is available here.

April 08

Worked on the thesis. First versions of the introduction, background, and first and second chapters are ready. I am currently working on the conclusion and future work.

April 01

Worked on the thesis; first versions of the introduction and background are ready.

March 21

Revised NOSSDAV paper

March 14

Worked on the software implementation of a more advanced version of video advertising. The current software loads keywords from an XML file, creates the video vector, and loads interests from a .txt file.
Submitted the camera-ready version of the ICME paper
Prepared the presentation for the ICME paper

Feb 28

Continued to revise the predicting quality work. Report is accessible from here.
Worked on a more advanced version of advertising on video. Report.
Created a new and updated set of common keywords for 55 different topics.

Feb 15

Started to work on a more advanced version of advertising on video. Report.
Revised the predicting quality work. Report is accessible from here.

Feb 8

Continued to revise the predicting quality work. Report is accessible from here.
Went over some papers to find solutions for creating dynamic threads on the GPU.

Feb 1

Revised the predicting quality work. Report is accessible from here.
Started to implement the proposed system for using Hadoop over hybrid CPU/GPU systems

Jan 24

(Ongoing) Designing the high-level architecture of the proposed approach for using Hadoop over hybrid CPU/GPU systems
Read one example of large-scale data processing with MapReduce
Read papers about GPU clusters for HPC
Explored Hadoop and its components, such as HDFS
Explored the architecture and specs of NVIDIA GPU clusters

Jan 17

Read two papers about Phoenix, a MapReduce implementation for multi-core processors.
Spent some days figuring out how to use the Mars framework and run some samples, but couldn't fully understand it. The following was done:
- Configured the system (Windows) to run Mars, including CUDA and SDK installation as well as VS9 configuration.
- Corrected some errors in the code (library mismatches)
- Asked the authors about the problems and got this answer: "I must apologize that mars_v2 is buggy and complex, and we don't maintain the code base any more, I strongly recommend you to try the latest version on linux"
- Tried to install mars_v2 on Linux, but it is still buggy and complex. It seems this framework can run only with certain configurations and with older versions of CUDA.
Explored Mars to understand its algorithm: in co-processing (hybrid) mode, it partitions the input data into two parts, one for CPU processing and the other for GPU processing. After the map stage, the data are merged on the CPU side and then dispatched again to CPU workers and GPU workers.
Looked at Phoenix, another system for MapReduce programming, from Stanford. It was the comparison baseline for Mars.
- Spent 2 days writing a resume and preparing for a YouTube interview.
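
The co-processing scheme noted above for Mars (partition the input, map each share, merge on the CPU side, dispatch again for reduce) can be sketched as follows. This is only an illustration of the data flow, not Mars code or its API: every name here (`partition_input`, `map_worker`, `merge_on_cpu`, `reduce_worker`, `hybrid_word_count`) and the `gpu_fraction` parameter are hypothetical, and the "GPU" share is processed by an ordinary CPU function standing in for a CUDA kernel.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using KV = std::pair<std::string, int>;

// Split the input into a CPU share and a GPU share. gpu_fraction is an
// assumed tuning knob in [0, 1]; the source does not say how Mars picks it.
std::pair<std::vector<std::string>, std::vector<std::string>>
partition_input(const std::vector<std::string>& input, double gpu_fraction) {
    if (gpu_fraction < 0.0) gpu_fraction = 0.0;
    if (gpu_fraction > 1.0) gpu_fraction = 1.0;
    std::ptrdiff_t gpu_n =
        static_cast<std::ptrdiff_t>(input.size() * gpu_fraction);
    std::vector<std::string> gpu(input.begin(), input.begin() + gpu_n);
    std::vector<std::string> cpu(input.begin() + gpu_n, input.end());
    return {cpu, gpu};  // first = CPU share, second = GPU share
}

// Stand-in map worker: emits (record, 1). A real GPU worker would run a
// CUDA kernel here; this sketch uses the same CPU code for both shares.
std::vector<KV> map_worker(const std::vector<std::string>& records) {
    std::vector<KV> out;
    for (const auto& r : records) out.emplace_back(r, 1);
    return out;
}

// Merge the intermediate pairs from both sides on the CPU, grouped by key.
std::map<std::string, std::vector<int>>
merge_on_cpu(const std::vector<KV>& cpu_pairs, const std::vector<KV>& gpu_pairs) {
    std::map<std::string, std::vector<int>> groups;
    for (const auto& kv : cpu_pairs) groups[kv.first].push_back(kv.second);
    for (const auto& kv : gpu_pairs) groups[kv.first].push_back(kv.second);
    return groups;
}

// Stand-in reduce worker: sums the values of one key group.
int reduce_worker(const std::vector<int>& values) {
    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}

// Full flow: partition, map both shares, merge, then dispatch each group to
// a reduce worker (here always the same CPU stand-in).
std::map<std::string, int>
hybrid_word_count(const std::vector<std::string>& input, double gpu_fraction) {
    auto parts = partition_input(input, gpu_fraction);
    auto cpu_pairs = map_worker(parts.first);
    auto gpu_pairs = map_worker(parts.second);
    auto groups = merge_on_cpu(cpu_pairs, gpu_pairs);
    std::map<std::string, int> result;
    for (const auto& g : groups) result[g.first] = reduce_worker(g.second);
    return result;
}
```

The result does not depend on `gpu_fraction`; the split only decides which side does the map work, which is why the merge step on the CPU side is needed before the reduce stage.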

Jan 10

Explored related work and potential ideas

Fall 2010 (TA)

Courses:
- CMPT-820: Multimedia Systems
- CMPT-825: NLP

worked on:
- effective advertising in video

Submissions:
- SmartAd: a smart autonomous system for effective advertising in video (ICME 11)

Summer 2010 (RA)

- Writing for publication

worked on:
- Estimating the click-through rate for new ads with semantic and feature-based similarity algorithms

Spring 2010 (RA)

Courses:
- CMPT-886: Special topics in operating systems

worked on:
- Accelerating online auctions using GPU
- Estimating the click-through rate for new ads with semantic and feature-based similarity