UNC COMP 775 Image Processing and Analysis
This recent survey paper gives an overview of 3DMMs (PCA/ASM/AAM): [1909.01815] 3D Morphable Face Models -- Past, Present and Future
Eigenface - Face Recognition using Principal Component Analysis - MachineLearningMastery.com
Problem setup: Assume we have a collection of pictures of human faces, all with the same pixel dimensions (e.g., all are r×c grayscale images). If we have M different pictures and vectorize each picture into L = r×c pixels, we can represent the entire dataset as an L×M matrix (call it A), where each element of the matrix is a pixel's grayscale value. Note that each column (or row, depending on the PCA convention) of A corresponds to one picture - this is important for understanding the following assignments.
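A minimal sketch of this setup (shapes and data are illustrative): vectorizing M tiny "images" into the columns of an L×M matrix A.

```python
import numpy as np

r, c, M = 4, 5, 3              # tiny example: three 4x5 grayscale images
rng = np.random.default_rng(0)
images = rng.random((M, r, c))  # stand-in for M face pictures

L = r * c
A = images.reshape(M, L).T      # each column of A is one vectorized picture
print(A.shape)                  # (20, 3) -> L x M
```

Here the column convention is used (one picture per column), which matches the PCA steps later.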
Resources
luchaoqi/UNC-COMP-775-Project: UNC COMP 775 final project
Assignment 1 Hough Transform
The Hough transform can be used to find circles/disks in images. At a high level, the steps are as follows:
- Take the derivative (gradient) at each pixel of the image. In particular, for the pixels on the circle, the gradients give a bunch of normal vectors pointing toward the center of the circle.
- Given a radius (an estimate or the ground truth of the circle), follow each normal vector by that radius and vote for the corresponding candidate center, i.e. each pixel casts a vote for its candidate center (e.g. +1 in a matrix). This yields an accumulator matrix whose local maxima are highly likely circle centers.
- Take the local maxima of the accumulator matrix as circle centers and draw circles with the radius from step 2.
Note that people sometimes compute derivatives only for pixels on edges (using edge detection, e.g. Gaussian convolution / Laplacian of Gaussian / Canny) to save computation.
Due to noise, we can't run the steps above on only one pixel; we use all the pixels on the circle to get an estimate of the center. Also due to noise, in edge detection we take derivatives of a Gaussian-smoothed image rather than of the raw image data - Introduction
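The voting step above can be sketched as follows, assuming a binary edge map `edges` and precomputed gradients `gx`, `gy` (all names here are illustrative, not a definitive implementation):

```python
import numpy as np

def hough_circle(edges, gx, gy, radius):
    """Accumulate votes for circle centers at one fixed radius.

    edges : boolean edge map; gx, gy : image gradients (they point along
    the circle's normals, toward or away from the center).
    """
    acc = np.zeros(edges.shape, dtype=float)
    ys, xs = np.nonzero(edges)
    mag = np.hypot(gx[ys, xs], gy[ys, xs]) + 1e-12
    # follow the normal by one radius to reach the candidate center;
    # vote both ways since the disk may be darker or brighter than the background
    for sign in (+1, -1):
        cy = np.rint(ys + sign * radius * gy[ys, xs] / mag).astype(int)
        cx = np.rint(xs + sign * radius * gx[ys, xs] / mag).astype(int)
        ok = (cy >= 0) & (cy < acc.shape[0]) & (cx >= 0) & (cx < acc.shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1.0)  # +1 vote in the accumulator
    return acc
```

The local maxima of the returned accumulator are then the detected centers; scanning a range of radii handles the unknown-radius case.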
Assignment 2 Segmentation via Deformable Models Methods
Preliminaries / background:
Index of /rbf/CVonline/LOCAL_COPIES/COOTES
This is important for understanding PDMs (point distribution models built with PCA), ASMs (gradient-driven segmentation), and AAMs (intensity-driven segmentation)
Active Shape Model (ASM)
ASM step by step DIP Lecture 25: Active shape models - YouTube
From an atlas (reference training data), we want to segment the test data, i.e. find the landmarks on the test data.
1) PCA to get the shape space - given training images, we manually select landmarks/a mask for each image and perform PCA on those points to get eigenvectors and eigenvalues, such that any shape x (formed by multiple landmarks) can be represented as x ≈ x̄ + Pb. (This formula can be understood as representing the data in eigenspace: Pb is just the vector from the mean x̄ pointing to x, and we use ≈ due to dimension reduction.)
The PCA is done on all landmarks across all images - not on one image. Imagine stacking all landmarks from one image into each column, so each column represents one shape. See p119b.pdf shown above for details
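Step 1 can be sketched like this, assuming the landmark shapes are already aligned and stacked one per column (a (2n)×M matrix, as described above); names are illustrative:

```python
import numpy as np

def build_shape_model(shapes, t):
    """shapes: (2n, M) matrix, one vectorized shape per column; keep t modes."""
    x_bar = shapes.mean(axis=1, keepdims=True)   # mean shape
    X = shapes - x_bar                           # center the data
    # eigenvectors of the covariance via SVD (columns of U, ordered by variance)
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    P = U[:, :t]                                 # first t eigenvectors
    return x_bar, P

def reconstruct(x_bar, P, b):
    """Any shape is approximated as x = x_bar + P b."""
    return x_bar + P @ b.reshape(-1, 1)
```

Projecting a shape into the space gives b = Pᵀ(x − x̄), and `reconstruct` maps b back to landmark coordinates.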
2) Given a test image, we first need to find the target points, i.e. the nose point in the model should correspond to the nose point in the test image. So we follow the normal vector at each data point of the current shape, and find the point in the test image that has the maximum gradient along this direction (a large gradient usually means an edge), e.g. select 5 data points along the normal vector and 5 along the opposite direction and find the maximum among the 11 points considered.
This assumes the model and the test image are in a common coordinate frame, i.e. the images have already been aligned via translation/rotation/scale
Also, when we follow the normal vectors of the data points, we use the mean shape x̄ as the initialization (b = 0), and then update b iteratively
DIP Lecture 25: Active shape models - YouTube:
3) Now we have an initial target, and this is not our final goal - the ground-truth target points. There are multiple reasons why the 11 sampled points might not be enough to reach the ground truth: sometimes we might need to sample 20 data points in step 2, or the gradient might not be an appropriate criterion, etc. We repeat step 2 using the shape computed from the current b as the initialization to get a new b, until convergence. The final b is the result.
Our final goal is to find b that represents all the landmarks (a 2n-dimensional vector x) such that x ≈ x̄ + Pb.
In summary: ASMs
We assume we have an initial estimate for the pose and shape parameters (e.g., the mean shape). This is iteratively updated as follows:
- Look along the normal through each model point to find the best local match for the model of the image appearance at that point (e.g., the strongest nearby edge)
- Update the pose and shape parameters to best fit the model instance to the found points
- Repeat until convergence
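The summary above can be sketched as one ASM iteration under strong simplifying assumptions (1-D search along each normal for the strongest gradient, then projection into the shape space with clamped parameters); `normals` and all names are illustrative:

```python
import numpy as np

def asm_iteration(image, x, normals, x_bar, P, eigvals, k=5, clamp=3.0):
    """One ASM update. x: (n, 2) landmarks; normals: (n, 2) unit normals;
    x_bar, P, eigvals: shape model (mean, eigenvectors, eigenvalues)."""
    grad_y, grad_x = np.gradient(image.astype(float))
    targets = np.empty_like(x)
    for i, ((py, px), (ny, nx)) in enumerate(zip(x, normals)):
        # sample k points on each side of the landmark along its normal
        offsets = np.arange(-k, k + 1)
        ys = np.clip(np.rint(py + offsets * ny).astype(int), 0, image.shape[0] - 1)
        xs = np.clip(np.rint(px + offsets * nx).astype(int), 0, image.shape[1] - 1)
        g = np.hypot(grad_y[ys, xs], grad_x[ys, xs])
        best = np.argmax(g)                      # strongest edge response wins
        targets[i] = (ys[best], xs[best])
    # project the found points into the shape space; clamp b to +-clamp*sqrt(lambda)
    b = P.T @ (targets.ravel() - x_bar)
    b = np.clip(b, -clamp * np.sqrt(eigvals), clamp * np.sqrt(eigvals))
    return (x_bar + P @ b).reshape(-1, 2), b
```

Calling this repeatedly until b stops changing is the "repeat until convergence" step; the clamping keeps the fitted shape plausible under the training distribution.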
AAM
PCA on the shape and on the grey-level information separately to get b_s and b_g, then PCA again on the concatenated vector b.
The motivation is that we want to update the PDM (shape) information under the constraint of intensities (intensity-driven instead of normal-vector-driven).
In ASM, we move points using normal vectors, while in AAM, we move them using the intensity difference δg.
As shown in ASM, the shape of any example (all landmarks) can be summarized by a vector b_s, with x = x̄ + P_s b_s.
Now we consider grey-level intensities and apply the same procedure as in ASM, so the texture of any example can be summarized by a vector b_g, with g = ḡ + P_g b_g.
Motivation: since there may be correlation between shape and grey-level variations, we apply a further PCA to the concatenated data b = (W_s b_s ; b_g),
where W_s is a diagonal matrix of weights for each shape parameter, allowing for the difference in units between the shape and grey models.
After this PCA, b can be represented as b = Qc.
Previously, we represented the shape and grey-levels of any example without the second PCA as x = x̄ + P_s b_s and g = ḡ + P_g b_g.
Now, for any given example, we can represent both the shape and the grey-levels as functions of c: x = x̄ + P_s W_s⁻¹ Q_s c and g = ḡ + P_g Q_g c, so c is our target instead of b in ASM.
Here Q = (Q_s ; Q_g) contains the eigenvectors and c is a vector of appearance parameters controlling both the shape and grey-levels of the model.
From the lecture slides:
As in the picture shown above, given a test image, similar to ASM, we use the mean appearance as the initialization:
- calculate δg = g_image − g_model, then use (pre-trained) linear regression to predict the parameter update δc = A δg - note δg is just the intensity difference between the reference and the target
- apply the shape part of the update to the test image to move the points
- apply the grey-level part to the reference image (atlas) to update the intensity information
- repeat the steps above until convergence
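The loop above can be sketched schematically, assuming a pre-trained regression matrix A mapping intensity residuals to parameter updates; `sample_texture`, `model_texture`, and the other names are hypothetical placeholders:

```python
import numpy as np

def aam_search(image, c, A, model_texture, sample_texture, n_iter=20, tol=1e-6):
    """Iteratively update appearance parameters c until the update converges."""
    for _ in range(n_iter):
        g_image = sample_texture(image, c)  # grey-levels under the current shape
        g_model = model_texture(c)          # grey-levels synthesized by the model
        delta_g = g_image - g_model         # residual between target and reference
        delta_c = A @ delta_g               # linear-regression prediction of the update
        if np.linalg.norm(delta_c) < tol:
            break
        c = c - delta_c                     # one step moves both shape and texture
    return c
```

In the real algorithm A is learned offline by perturbing c on training images and regressing the known δc against the observed δg.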
Assignment 3 Registration Methods
The Mahalanobis distance is a measure of the distance between a sample point and a distribution: D(x) = sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)).
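A quick numeric sketch of that formula:

```python
import numpy as np

def mahalanobis(x, mu, cov):
    """D(x) = sqrt((x - mu)^T cov^{-1} (x - mu))."""
    d = x - mu
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

# with an identity covariance it reduces to the Euclidean distance
print(mahalanobis(np.array([3.0, 4.0]), np.zeros(2), np.eye(2)))  # 5.0
```

With an anisotropic covariance, distances shrink along high-variance directions, which is why it measures distance to a distribution rather than to a point.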
Metrics to evaluate data mismatch:
- Sum of squared differences (SSD) - fails when the contrast is reversed (black and white swapped)
- normalized correlation
- quantile function - requires choosing a feature
- mutual information
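Three of these measures can be sketched for two equally-sized grayscale images (the MI estimate uses a simple joint histogram; bin count and names are illustrative):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences - small when intensities match pointwise."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def normalized_correlation(a, b):
    """Correlation of mean-centered intensities, in [-1, 1]."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mutual_information(a, b, bins=32):
    """Histogram-based MI estimate - high whenever intensities are predictable
    from each other, even under contrast reversal."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])))
```

This makes the SSD failure mode concrete: for a contrast-reversed image, SSD is large and the correlation is −1, while MI stays high because the intensity mapping is still deterministic.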
Registration:
- Elastic method
- Fluid method
- LDDMM
a) What is the main advantage of elastic registration methods over LDDMM methods? Elastic registration is a simple and fast model that enforces smoothness by regularizing the displacement field u directly - it works quite well for small displacements. b) What are the two main advantages of LDDMM methods over elastic registration methods? LDDMM works when large deformations are required. It is well known that elastic deformation models cannot guarantee diffeomorphic solutions (they may fold) if large deformations are required within the registration; LDDMM and fluid-based models can avoid folding.
- An integral involving only L2 norms gives a metric whose geodesics may be used in LDDMM: the energy integrates the L2 norm of Lv over time, where v is the velocity field and L is a differential operator.