Difference between revisions of "Private:progress-hamza"
From NMSL
Revision as of 16:33, 17 April 2011
Spring 2011 (RA)
- Courses: None
Apr 22
- Downloaded and compiled ITK and VTK. The two libraries are huge, and each took hours to compile. They use the cmake utility for configuring and building (even for creating new projects), and VTK provides a tutorial on how to use cmake with Eclipse. Managed to read DICOM slices and save them as a volume in the MetaImage format. Was also able to render the generated MetaImage using VTK, but for some reason I'm getting only a 2D image.
- In biomedicine, 3-D data are acquired by a multitude of imaging devices [magnetic resonance imaging (MRI), CT, 3-D microscopy, etc.]. In most cases, 3-D images are represented as a sequence of parallel two-dimensional (2-D) image slices. Three-dimensional visualization is a set of theories, methods, and techniques that apply computer graphics, image processing, and human-computer interaction to transform the data produced by scientific computing into graphics.
- DICOM files consist of a header and a body of image data. The header contains standardized as well as free-form fields. The set of standardized fields is called the public DICOM dictionary. A single DICOM file can contain multiple frames, allowing storage of volumes or animations. Image data can be compressed using a large variety of standards, including JPEG (both lossy and lossless), LZW (Lempel-Ziv-Welch), and RLE (run-length encoding).
- It seems progressive meshes may not be very appropriate for representing the objects in medical applications. The doctors need to slice the object and look at cross sections. Meshes will only show the outer surface.
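To make the slice-versus-surface point concrete, here is a minimal sketch of stacking parallel 2-D slices into a volume and cutting cross sections through it. It assumes equally sized slices with uniform spacing and uses plain nested lists; the function names (stack_slices, cross_section_y, cross_section_x) are hypothetical and this is not the ITK or VTK API.

```python
def stack_slices(slices, spacing_mm):
    """Stack equally sized 2-D slices (lists of rows) into volume[z][y][x].

    Returns the volume and the z position (in mm) of every slice,
    assuming a uniform inter-slice spacing (an assumption of this sketch).
    """
    rows, cols = len(slices[0]), len(slices[0][0])
    for s in slices:
        if len(s) != rows or any(len(r) != cols for r in s):
            raise ValueError("all slices must have the same dimensions")
    z_positions = [i * spacing_mm for i in range(len(slices))]
    volume = [[row[:] for row in s] for s in slices]
    return volume, z_positions


def cross_section_y(volume, y):
    """Cut the volume at a fixed row index y; the result is indexed [z][x]."""
    return [slice_[y][:] for slice_ in volume]


def cross_section_x(volume, x):
    """Cut the volume at a fixed column index x; the result is indexed [z][y]."""
    return [[row[x] for row in slice_] for slice_ in volume]


if __name__ == "__main__":
    slices = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
    volume, z = stack_slices(slices, 2.5)
    print(z)
    print(cross_section_y(volume, 1))
```

A surface mesh keeps only the boundary of the object, whereas a volume stored this way still holds every interior voxel, so arbitrary cross sections remain available.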
Apr 15
- Jang et al. proposed a real-time implementation of a multi-view image synthesis system. This implementation is based on lookup tables (LUTs). In their implementation, the sizes of the LUTs for rotation conversion and disparity are 1.1 MBytes and 900 Bytes for each viewpoint, respectively. Without LUTs, creating the left and right images took 3.845 sec, which doesn't enable real-time synthesis. Using LUTs reduced the processing time to 0.062 sec (roughly a 62x speedup).
- Park et al. presented a depth-image-based rendering (DIBR) technique for 3DTV service over terrestrial-digital multimedia broadcasting (T-DMB), the mobile TV standard adopted by Korea. They leverage the previously mentioned real-time view synthesis technique by Jang et al. to overcome the computational cost of generating the auto-stereoscopic image. Moreover, they propose a depth pre-processing method using two adaptive smoothing filters to minimize the number of holes caused by disocclusion during the view synthesis process.
- Gurler et al. presented a multi-core decoding architecture for multiview video encoded in MVC. Their proposal is based on the idea of decomposing the input N-view stream into M independently decodable sub-streams and decoding each sub-stream in a separate thread using multiple instances of the MVC decoder. However, to obtain such independently decodable sub-streams, the video must be encoded using special inter-view prediction schemes that depend on the number of cores.
- As indicated by Yuan et al., the distortion of virtual views is influenced by four factors in 3DV systems:
- compression of texture videos and depth maps
- performance of the view synthesis algorithm
- inherent inaccuracy of depth maps
- whether the captured texture videos are well rectified
- Trying to encode two-view texture and depth map streams using JMVC (the multiview reference encoder) to get an idea of how much overhead is incurred by transmitting an additional view along with depth maps when sending 3D video over wireless channels. Managed to compile the source and edit the configuration files, but still get errors when encoding. Looking more into the configuration file parameters.
- Looked more into DICOM slices. They are simply parallel 2D sections of an object. Using those slices, and knowing the inter-slice distance, medical imaging software is able to reconstruct the 3D representation. The more recent versions of the DICOM standard enable packaging all the slices into one file, which reduces header overhead by eliminating redundant headers.
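The lookup-table idea behind the speedup reported for Jang et al. can be illustrated with a minimal sketch. It is not their implementation (their tables also cover rotation conversion). It only precomputes a disparity for each of the 256 possible 8-bit depth values, using the common convention that a depth value v encodes 1/Z linearly between 1/z_far and 1/z_near. All parameter values and function names here are hypothetical.

```python
def build_disparity_lut(focal_px, baseline, z_near, z_far):
    """Precompute an integer pixel disparity for every 8-bit depth value.

    Assumes depth value v maps to inverse depth as
        1/Z = (v / 255) * (1/z_near - 1/z_far) + 1/z_far
    and that disparity = focal_px * baseline / Z (rectified cameras).
    """
    lut = []
    for v in range(256):
        inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
        lut.append(round(focal_px * baseline * inv_z))
    return lut


def disparities(depth_row, lut):
    """Per-pixel work is now a single table lookup instead of arithmetic."""
    return [lut[v] for v in depth_row]


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, 5 cm baseline, scene 1 m to 10 m.
    lut = build_disparity_lut(1000.0, 0.05, 1.0, 10.0)
    print(disparities([0, 128, 255], lut))
```

The table is built once per viewpoint, so the per-frame cost drops to one memory access per pixel, which is the kind of trade (a small amount of memory for much less computation) that the reported timings reflect.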
Apr 8
- Gathered different thoughts from my readings in the Readings and Thoughts section of the 3D Video Remote Rendering and Adaptation System Wiki page.
- Could not find any work on distributed view synthesis.
- I went over the work done by Dr. Hamarneh's students. I read the publications and the report he sent. However, as far as I can see, it is implementation work that ports an existing open source medical image analysis toolkit to the iOS platform. There are no algorithms or theory involved. That said, one of their future goals is to facilitate reading, writing, and processing of 3D or higher dimensional medical images on iOS (which only supports normal 2D image formats). Current visualization of such imagery on desktop machines is performed via the Visualization ToolKit (VTK). One of their goals is to also port this toolkit to iOS. Another tool I found that is also based on VTK is Slicer, an open source software package for visualization and image analysis.
- Based on my readings on progressive mesh streaming, it should be applicable in this context. However, I'm still not familiar with the standard formats and the encoding of such meshes (especially in medical image analysis and visualization applications). Generally, it seems that medical images have their own formats, such as the DICOM standard. Their initial thought is to transmit a number of what are known as DICOM slices to the receiver, which would then construct the 3D model from them. So, this is still not very clear to me, nor is it clear whether 3D video technologies may play a role in this.
Mar 14
- Report: here
- Added more details on homographies in the report.
- Implemented double warping and blending, as well as inverse warping using Armadillo C++ linear algebra library.
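For reference, here is a minimal sketch of how a 3x3 homography maps a pixel and how inverse warping uses it. This is not the Armadillo-based implementation mentioned above. It uses plain Python lists and nearest-neighbour sampling, and takes the inverse homography H_inv as given. The function names are hypothetical.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 matrix H using homogeneous coordinates."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Dividing by w converts back from homogeneous to pixel coordinates.
    return xh / w, yh / w


def inverse_warp(src, H_inv, width, height, fill=None):
    """For every target pixel, look up the source pixel it came from.

    H_inv maps target coordinates to source coordinates. Pixels that map
    outside the source image keep the fill value.
    """
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = apply_homography(H_inv, x, y)
            sx, sy = int(round(sx)), int(round(sy))
            if 0 <= sy < len(src) and 0 <= sx < len(src[0]):
                out[y][x] = src[sy][sx]
    return out


if __name__ == "__main__":
    src = [[1, 2, 3], [4, 5, 6]]
    # Pure translation: target (x, y) reads from source (x + 1, y).
    H_inv = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
    print(inverse_warp(src, H_inv, 3, 2))
```

Because every target pixel is visited exactly once, inverse warping leaves no cracks inside the covered region, which is one reason it is often paired with forward warping of the depth map.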
Mar 7
- Added more detailed description of the view synthesis process.
- Implemented the first phase of the process (forward warping) and the z-buffer competition resolution technique in C/C++. I tested it on the Breakdancers sequence from MSR.
- Working on profiling the code using OProfile to calculate the number of cycles required by the view synthesis process to derive preliminary estimates of power consumption.
- Implementing double warping and a hole filling technique to get a feel for the final quality that can be obtained.
- Understanding homography matrices and how they are used to speed up the synthesis process.
- Working on deriving a formal analysis of the time complexity of the view synthesis process. The projection phase basically involves a number of matrix multiplications.
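The forward warping phase and the z-buffer competition can be sketched for a single scanline as follows. This is a simplified illustration, not the C/C++ code tested on Breakdancers. It assumes rectified cameras, so each pixel only shifts horizontally by an integer disparity, and it assumes depths are distances (smaller means closer). The function name forward_warp_row is hypothetical.

```python
def forward_warp_row(colors, depths, disparities, fill=None):
    """Warp one scanline into the virtual view.

    Each source pixel x lands at x - disparities[x]. When several source
    pixels compete for the same target position, the z-buffer keeps the
    one closest to the camera. Positions nobody lands on stay as holes.
    """
    width = len(colors)
    out = [fill] * width
    zbuf = [float("inf")] * width
    for x in range(width):
        tx = x - disparities[x]
        if 0 <= tx < width and depths[x] < zbuf[tx]:
            zbuf[tx] = depths[x]
            out[tx] = colors[x]
    return out


if __name__ == "__main__":
    colors = ["a", "b", "c", "d"]
    depths = [5.0, 5.0, 1.0, 5.0]   # pixel "c" is in the foreground
    disparities = [0, 0, 1, 0]      # so it shifts by one pixel
    print(forward_warp_row(colors, depths, disparities))
```

In this example the foreground pixel overwrites the background pixel it lands on, and the position it vacated becomes a hole (a disocclusion), which is what the hole filling step has to repair. The loop does a constant amount of work per pixel, so this simplified case is linear in the number of pixels.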
Feb 28
Feb 21
- Familiarizing myself with JSVM and its tools and options.
- Contacted the lab that developed the reference software for disparity estimation and view synthesis described in the MPEG technical reports. Still haven't received a reply.
Feb 14
- Reading about SVC and how to perform bitstream extraction
- Reading Cheng's paper on viewing time scalability and Som's IWQoS paper.
- Reading a couple of papers on optimized substream extraction
- Reading papers on modelling the synthesized view distortion in V+D 3D videos
Jan 24
- Report: here
- 3D Video Remote Rendering and Adaptation System
- Market survey:
- The mobile market seems to be shifting towards multicore processors. At CES 2011, at least two companies showcased their new mobile phones (LG Optimus 2X and Motorola ATRIX 4G) based on the NVIDIA Tegra 2 dual-core ARM Cortex A9 processor. This looks promising as it may enable smoother graphics capabilities and may be useful for fast view synthesis on the mobile device. However, some evaluation of power consumption needs to be performed. The chip also includes an ultra-low power (ULP) GeForce GPU and is capable of decoding 1080p HD video. Demo Video
- Tablets emerging in the market nowadays are using the Tegra 2 processor (e.g. Dell Streak 7 and Motorola XOOM)
- Qualcomm Snapdragon, Samsung Orion (Video), and Texas Instruments OMAP4 are all dual-core processors expected in the first half of 2011.
- Slides leaked this weekend from NVIDIA's presentation at the Mobile World Congress indicate that the company will be shipping a Tegra 2 3D processor this year intended for use in mobile gadgets featuring a 3D screen! Although this is yet to be confirmed, devices such as LG's G-Slate, which is expected to have a glasses-free, three-dimensional display and to ship around the same time, are expected to run on this processor. Moreover, an announcement of a Tegra 3 processor is expected in February.
- The recent release of Gingerbread (Android 2.3) coincided with the release of a new NDK (r5), which allows application lifecycle management and window management to be performed outside Java. This means an application can be written entirely in C/C++/ARM assembly code without the need to write Java code or JNI bindings.
Jan 17
- Concentrated on view synthesis in 3D video systems and read two recent survey papers about the topic.
- Reading about multiple view geometry to understand the warping process and the related terms from epipolar, trifocal, and projective geometry.
- Understanding the commonly used pinhole camera model.
- Reading about stereo-based view synthesis.
- Went over 3 papers on real-time view synthesis using GPUs.
Jan 10
- Exploring potential research directions in 3D videos, including: adaptive virtual view rendering in free-viewpoint video, view synthesis, and rate adaptation in 3D video streaming.
- Investigating the potential of cloud computing as a platform for enabling remote rendering of 3D video for mobile devices.
Fall 2010 (RA)
- Courses:
- CMPT-765: Computer Communication Networks
- Submissions:
- Energy Saving in Multiplayer Mobile Games (TOM'11)
- Publications:
- Energy-Efficient Gaming on Mobile Devices using Dead Reckoning-based Power Management (NetGames'10)
Summer 2010 (DGS-GF)
- Courses: None
- Submissions:
- Energy-Efficient Gaming on Mobile Devices using Dead Reckoning-based Power Management (NetGames'10)
Spring 2010 (TA)
- Courses:
- CMPT-705: Design and Analysis of Algorithms
Fall 2009 (RA)
- Courses:
- CMPT-771: Internet Architecture and Protocols
- Submissions:
- Efficient AS Path Computation and Its Application to Peer Matching (NSDI'10)
Summer 2009 (RA)
- Submissions:
- Efficient Peer Matching Algorithms (CoNEXT'09)
Spring 2009 (TA)
- Courses:
- CMPT-820: Multimedia Systems