CSE 252B: Computer Vision II, Winter 2019 – Assignment 5¶

Instructor: Ben Ochoa¶

Due: Wednesday, March 20, 2019, 11:59 PM¶

Instructions¶

Review the academic integrity and collaboration policies on the course website.
This assignment must be completed individually.
This assignment contains both math and programming problems.
All solutions must be written in this notebook
Math problems must be done in Markdown/LATEX. Remember to show work and describe your solution.
Programming aspects of this assignment must be completed using Python in this notebook.
Your code should be well written with sufficient comments to understand, but there is no need to write extra markdown to describe your solution if it is not explictly asked for.
This notebook contains skeleton code, which should not be modified (This is important for standardization to facilate effeciant grading).
You may use python packages for basic linear algebra, but you may not use packages that directly solve the problem. Ask the instructor if in doubt.
You must submit this notebook exported as a pdf. You must also submit this notebook as an .ipynb file.
Your code and results should remain inline in the pdf (Do not move your code to an appendix).
You must submit both files (.pdf and .ipynb) on Gradescope. You must mark each problem on Gradescope in the pdf.
It is highly recommended that you begin working on this assignment early.

Problem 1 (Math): Point on Line Closest to the Origin (5 points)¶

Given a line $\boldsymbol{l} = (a, b, c)^\top$, show that the point on $\boldsymbol{l}$ that is closest to the origin is the point $\boldsymbol{x} = (-ac, -bc, a^2+b^2)^\top$ (Hint: this calculation is needed in the two-view optimal triangulation method used below).

Answer:

The normal vector of line $\boldsymbol{l} = (a, b, c)^\top$ is $(a, b)^\top$, and the origin in homogeneous coordinates is $\boldsymbol{o} = (0, 0, 1)^\top$. The point on the line orthogonal to $\boldsymbol{l}$ and pass through the origin is $\boldsymbol{x'} = (a, b, 1)^\top$. Thus the line pass through the origin and is orthogonal to $\boldsymbol{l}$ is

$$\boldsymbol{l'} = \boldsymbol{o} \times \boldsymbol{x'} = (0, 0, 1)^\top \times (a, b, 1)^\top = (-b, a, 0)^\top$$

Then, the point on $\boldsymbol{l}$ that is closest to the origin is the intersection of $\boldsymbol{l}$ and $\boldsymbol{l'}$, which is $$\boldsymbol{x} = \boldsymbol{l} \times \boldsymbol{l'} = (a, b, c)^\top \times (-b, a, 0)^\top = (-ac, -bc, a^2+b^2)^\top$$

Problem 2 (Programming): Feature Detection (20 points)¶

Download input data from the course website. The file IMG_5030.JPG contains image 1 and the file IMG_5031.JPG contains image 2.

For each input image, calculate an image where each pixel value is the minor eigenvalue of the gradient matrix

$N=\left[ \begin{array}{cc} \sum\limits_w I_x^2 & \sum\limits_w I_x I_y\\ \sum\limits_w I_x I_y & \sum\limits_w I_y^2 \end{array} \right]$

where w is the window about the pixel, and $I_x$ and $I_y$ are the gradient images in the x and y direction, respectively. Calculate the gradient images using the fivepoint central difference operator. Set resulting values that are below a specified threshold value to zero (hint: calculating the mean instead of the sum in N allows for adjusting the size of the window without changing the threshold value). Apply an operation that suppresses (sets to 0) local (i.e., about a window) nonmaximum pixel values in the minor eigenvalue image. Vary these parameters such that around 1350–1400 features are detected in each image. For resulting nonzero pixel values, determine the subpixel feature coordinate using the Forstner corner point operator.

Report your final values for:¶

the size of the feature detection window (i.e. the size of the window used to calculate the elements in the gradient matrix N)
the minor eigenvalue threshold value
the size of the local nonmaximum suppression window
the resulting number of features detected (i.e. corners) in each image.

Display figures for:¶

original images with detected features, where the detected features are indicated by a square window (the size of the detection window) about the features

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from scipy.signal import convolve2d as conv2d
from scipy import signal

def rgb2gray(rgb):
    return np.dot(rgb[...,:3], [0.299, 0.587, 0.114])

def ImageGradient(I, w, t):
    # inputs: 
    # I is the input image (may be mxn for Grayscale or mxnx3 for RGB)
    # w is the size of the window used to compute the gradient matrix N
    # t is the minor eigenvalue threshold
    #
    # outputs:
    # N is the 2x2xmxn gradient matrix
    # b in the 2x1xmxn vector used in the Forstner corner detector
    # J0 is the mxn minor eigenvalue image of N before thresholding
    # J1 is the mxn minor eigenvalue image of N after thresholding
    
    """your code here"""
    
    m,n = I.shape[:2]
    I = I*255
    I = rgb2gray(I)
    N = np.zeros((2,2,m,n))
    b = np.zeros((2,1,m,n))
    J0 = np.zeros((m,n))
    J1 = np.zeros((m,n))
    
    kx = np.reshape(1/12*np.array([-1, 8, 0, -8, 1]),(1,5))
    ky = kx.transpose()
    Ix = signal.convolve2d(I, kx, boundary='symm', mode='same')
    Iy = signal.convolve2d(I, ky, boundary='symm', mode='same')
    Ix2 = Ix*Ix
    Iy2 = Iy*Iy
    Ixy = Ix*Iy
    X = np.array([[i for i in range(n)] for j in range(m)])
    Y = np.array([[j for i in range(n)] for j in range(m)])
    xIx2_ = X*Ix2 + Y*Ix*Iy
    xIxIy_ = X*Ix*Iy + Y*Iy2
    

    d = int((w-1)/2)
    for y in range(d,m-d):
        for x in range(d,n-d):
            Ix2_win = Ix2[int(y-d):int(y+d+1),int(x-d):int(x+d+1)]
            Iy2_win = Iy2[int(y-d):int(y+d+1),int(x-d):int(x+d+1)]
            Ixy_win = Ixy[int(y-d):int(y+d+1),int(x-d):int(x+d+1)]
            x2 = np.mean(Ix2_win)
            y2 = np.mean(Iy2_win)
            xy = np.mean(Ixy_win)
            nxy = np.array([[x2, xy], [xy, y2]])
            N[:,:,y,x] = nxy
            xIx2_win = xIx2_[int(y-d):int(y+d+1),int(x-d):int(x+d+1)]
            xIxIy_win = xIxIy_[int(y-d):int(y+d+1),int(x-d):int(x+d+1)]
            b1 = np.mean(xIx2_win)
            b2 = np.mean(xIxIy_win)
            b[:,:,y,x] = [[b1],[b2]]
            meig = (np.trace(nxy)- (np.trace(nxy)**2 - 4*np.linalg.det(nxy))**0.5)*0.5
            J0[y,x] = meig
            if meig > t:
                J1[y,x] = meig

    
    return N, b, J0, J1

  
  
def NMS(J, w_nms):
    # Apply nonmaximum supression to J using window w
    # For any window in J, the result should only contain 1 nonzero value
    # In the case of multiple identical maxima in the same window,
    # the tie may be broken arbitrarily
    #
    # inputs: 
    # J is the minor eigenvalue image input image after thresholding
    # w_nms is the size of the local nonmaximum suppression window
    # 
    # outputs:
    # J2 is the mxn resulting image after applying nonmaximum suppression
    # 
    
    J2 = J.copy()
    """your code here"""
    m,n = J.shape[:2]
    d = int((w_nms-1)/2)
    for y in range(d,m-d):
        for x in range(d,n-d):
            Imax = np.max(J[int(y-d):int(y+d+1),int(x-d):int(x+d+1)])
            if J2[y,x] < Imax:
                J2[y,x] = 0
    
    
    return J2
  
  
def ForstnerCornerDetector(J, N, b):
    # Gather the coordinates of the nonzero pixels in J 
    # Then compute the sub pixel location of each point using the Forstner operator
    #
    # inputs:
    # J is the NMS image
    # N is the 2x2xmxn gradient matrix
    # b is the 2x1xmxn vector computed in the image_gradient function
    #
    # outputs:
    # C is the number of corners detected in each image
    # pts is the 2xC list of coordinates of subpixel accurate corners
    #     found using the Forstner corner detector
    
    """your code here"""
    
    
    coord = np.transpose(np.nonzero(J))
    C = coord.shape[0]
    pts = np.zeros((2,C))
    for i in range(coord.shape[0]):
        y = coord[i][0]
        x = coord[i][1]
        A = N[:,:,y,x]
        b_= b[:,:,y,x]
        corner = np.linalg.solve(A,b_)
        corner = corner.reshape((2)).astype(int)
        pts[:,i] = corner
        
    return C, pts


# feature detection
def RunFeatureDetection(I, w, t, w_nms):
    N, b, J0, J1 = ImageGradient(I, w, t)
    J2 = NMS(J1, w_nms)
    C, pts = ForstnerCornerDetector(J2, N, b)

    return C, pts, J0, J1, J2

from PIL import Image
import time

# input images
I1 = np.array(Image.open('IMG_5030.JPG'), dtype='float')/255.
I2 = np.array(Image.open('IMG_5031.JPG'), dtype='float')/255.

# parameters to tune
w = 9
t = 10.5
w_nms = 9

tic = time.time()

# run feature detection algorithm on input images
C1, pts1, J1_0, J1_1, J1_2 = RunFeatureDetection(I1, w, t, w_nms)
C2, pts2, J2_0, J2_1, J2_2 = RunFeatureDetection(I2, w, t, w_nms)
toc = time.time() - tic

print('took %f secs'%toc)

# display results
plt.figure(figsize=(14,24))

# show corners on original images
ax = plt.subplot(1,2,1)
plt.imshow(I1)
for i in range(C1): # draw rectangles of size w around corners
    x,y = pts1[:,i]
    ax.add_patch(patches.Rectangle((x-w/2,y-w/2),w,w, fill=False))
# plt.plot(pts1[0,:], pts1[1,:], '.b') # display subpixel corners
plt.title('Found %d Corners'%C1)

ax = plt.subplot(1,2,2)
plt.imshow(I2)
for i in range(C2):
    x,y = pts2[:,i]
    ax.add_patch(patches.Rectangle((x-w/2,y-w/2),w,w, fill=False))
# plt.plot(pts2[0,:], pts2[1,:], '.b')
plt.title('Found %d Corners'%C2)

plt.show()

/anaconda3/lib/python3.7/site-packages/ipykernel_launcher.py:62: RuntimeWarning: invalid value encountered in double_scalars

took 103.951598 secs

Final values for parameters¶

w = 9
t = 10.5
w_nms = 9
C1 = 1393
C2 = 1328

Problem 3 (Programming): Feature matching (15 points)¶

Determine the set of one-to-one putative feature correspondences by performing a brute-force search for the greatest correlation coefficient value (in the range [-1, 1]) between the detected features in image 1 and the detected features in image 2. Only allow matches that are above a specified correlation coefficient threshold value (note that calculating the correlation coefficient allows for adjusting the size of the matching window without changing the threshold value). Further, only allow matches that are above a specified distance ratio threshold value, where distance is measured to the next best match for a given feature. Vary these parameters such that around 300 putative feature correspondences are established. Optional: constrain the search to coordinates in image 2 that are within a proximity of the detected feature coordinates in image 1.

Report your final values for:¶

the size of the matching window
the correlation coefficient threshold
the distance ratio threshold
the size of the proximity window (if used)
the resulting number of putative feature correspondences (i.e. matched features)

Display figures for:¶

pair of images, where the matched features are indicated by a square window (the size of the matching window) about the feature and a line segment is drawn from the feature to the coordinates of the corresponding feature in the other image

def compute_ncc(img1, img2,c1,c2,R):
    '''
    helper: compute ncc for two windows
    '''
    matching_score = 0
    num = 0
    sum1 = 0
    sum2 = 0
    mean1 = np.sum(img1[int(c1[1]-R):int(c1[1]+R+1),int(c1[0]-R):int(c1[0]+R+1)])/(R*R)
    mean2 = np.sum(img2[int(c2[1]-R):int(c2[1]+R+1),int(c2[0]-R):int(c2[0]+R+1)])/(R*R)

    for y in range((-R),(R+1)):
        for x in range((-R),(R+1)):
            num += (img1[c1[1]+y,c1[0]+x]-mean1)*(img2[c2[1]+y,c2[0]+x]-mean2)
            sum1 += (img1[c1[1]+y,c1[0]+x]-mean1)**2
            sum2 += (img2[c2[1]+y,c2[0]+x]-mean2)**2
            
    matching_score = num/np.sqrt(sum1*sum2)
    
    return matching_score


def NCC(I1, I2, pts1, pts2, w, p):
    # compute the normalized cross correlation between image patches I1, I2
    # result should be in the range [-1,1]
    #
    # inputs:
    # I1, I2 are the input images
    # pts1, pts2 are the point to be matched
    # w is the size of the matching window to compute correlation coefficients
    # p is the size of the proximity window
    #
    # output:
    # normalized cross correlation matrix of scores between all windows in 
    #    image 1 and all windows in image 2
    
    """your code here"""
    R = int((w-1)/2)
    scores = np.zeros((pts1.shape[1],pts2.shape[1]))
    I1_gray = rgb2gray(I1)
    I2_gray = rgb2gray(I2)
    for i in range(pts1.shape[1]):
        for j in range(pts2.shape[1]):
            c1 = pts1[:,i].astype(int)
            c2 = pts2[:,j].astype(int)
            if np.sqrt((c1[0]-c2[0])**2 + (c1[1]-c2[1])**2) < p:
                if c1[0] > R and c1[0]+R+1 < I1.shape[1] and c1[1] > R and c1[1]+R+1 < I1.shape[0]:
                    if c2[0] > R and c2[0]+R+1 < I2.shape[1] and c2[1] > R and c2[1]+R+1 < I2.shape[0]:
                        scor = compute_ncc(I1_gray,I2_gray,c1,c2,R)
                        scores[i,j] = scor
    
    return scores


def Match(scores, t, d):
    # perform the one-to-one correspondence matching on the correlation coefficient matrix
    # 
    # inputs:
    # scores is the NCC matrix
    # t is the correlation coefficient threshold
    # d distance ration threshold
    #
    # output:
    # list of the feature coordinates in image 1 and image 2 
    
    """your code here"""
    inds = np.zeros((2,1))
    mask = np.ones(scores.shape)
    masked_scores = mask * scores
    max_cc = np.max(masked_scores)
    next_max = 0
    while t < max_cc :
        indices = np.where(masked_scores == max_cc)
        scores[int(indices[0]),int(indices[1])] = -1
        next_max = np.max(scores[int(indices[0]),:])
        
        if np.max(scores[:,int(indices[1])]) > next_max:
            next_max = np.max(scores[:,int(indices[1])])
            
        scores[int(indices[0]),int(indices[1])] = max_cc
        
        if (1-max_cc) < (1-next_max)*d:
            inds = np.hstack((inds,[[int(indices[0])],[int(indices[1])]]))
            
        mask[int(indices[0]),:] = 0
        mask[:,int(indices[1])] = 0
        masked_scores = mask * scores
        max_cc = np.max(masked_scores)
        
    inds = inds[:,1:].astype(int)
    

    return inds



def RunFeatureMatching(I1, I2, pts1, pts2, w, t, d, p=0):
    # inputs:
    # I1, I2 are the input images
    # pts1, pts2 are the point to be matched
    # w is the size of the matching window to compute correlation coefficients
    # t is the correlation coefficient threshold
    # d distance ration threshold
    # p is the size of the proximity window
    #
    # outputs:
    # inds is a 2xk matrix of matches where inds[0,i] indexs a point pts1 
    #     and inds[1,i] indexs a point in pts2, where k is the number of matches
    
    scores = NCC(I1, I2, pts1, pts2, w, p)
    inds = Match(scores, t, d) 
    
    return inds

# parameters to tune
w = 11
t = 0.75
d = 0.87
p = 100

tic = time.time()
# run the feature matching algorithm on the input images and detected features
inds = RunFeatureMatching(I1, I2, pts1, pts2, w, t, d, p)
toc = time.time() - tic

print('took %f secs'%toc)

# create new matrices of points which contain only the matched features 
match1 = pts1[:,inds[0,:]]
match2 = pts2[:,inds[1,:]]

# # display the results
plt.figure(figsize=(14,24))
ax1 = plt.subplot(1,2,1)
ax2 = plt.subplot(1,2,2)
ax1.imshow(I1)
ax2.imshow(I2)
plt.title('Found %d Putative Matches'%match1.shape[1])
for i in range(match1.shape[1]):
    x1,y1 = match1[:,i]
    x2,y2 = match2[:,i]
    ax1.plot([x1, x2],[y1, y2],'-r')
    ax1.add_patch(patches.Rectangle((x1-w/2,y1-w/2),w,w, fill=False))
    ax2.plot([x2, x1],[y2, y1],'-r')
    ax2.add_patch(patches.Rectangle((x2-w/2,y2-w/2),w,w, fill=False))

plt.show()

print('unique points in image 1: %d'%np.unique(inds[0,:]).shape[0])
print('unique points in image 2: %d'%np.unique(inds[1,:]).shape[0])

took 59.355355 secs

unique points in image 1: 301
unique points in image 2: 301

Final values for parameters¶

w = 11
t = 0.75
d = 0.87
p = 100
num_matches = 301

Problem 4 (Programming): Outlier Rejection (20 points)¶

The resulting set of putative point correspondences should contain both inlier and outlier correspondences (i.e., false matches). Determine the set of inlier point correspondences using the M-estimator Sample Consensus (MSAC) algorithm, where the maximum number of attempts to find a consensus set is determined adaptively. For each trial, you must use the 7-point algorithm (as described in lecture) to estimate the fundamental matrix, resulting in 1 or 3 solutions. Calculate the (squared) Sampson error as a first order approximation to the geometric error.

Hint: this problem has codimension 1

Also: fix a random seed in your MSAC. If I cannot reproduce your results, you will lose points.

Report your values for:¶

the probability $p$ that as least one of the random samples does not contain any outliers
the probability $\alpha$ that a given point is an inlier
the resulting number of inliers
the number of attempts to find the consensus set
the tolerance for inliers
the cost threshold
random seed

Display figures for:¶

pair of images, where the inlier features in each of the images are indicated by a square window about the feature and a line segment is drawn from the feature to the coordinates of the corresponding feature in the other image

from scipy.stats import chi2
import random
from sympy.solvers import solve
from sympy import Symbol
from sympy import *

def Homogenize(x):
    # converts points from inhomogeneous to homogeneous coordinates
    return np.vstack((x,np.ones((1,x.shape[1]))))

def Dehomogenize(x):
    # converts points from homogeneous to inhomogeneous coordinates
    return x[:-1]/x[-1]

def Normalize(pts):
    # data normalization of n dimensional pts
    #
    # Input:
    #    pts - is in inhomogeneous coordinates
    # Outputs:
    #    pts - data normalized points
    #    T - corresponding transformation matrix
    """your code here"""
    homo_pts = Homogenize(pts)
    mean = np.mean(pts,axis=1)
    # for 2D points
    if pts.shape[0] == 2:
        s = (2/np.sum(np.var(pts,axis=1)))**0.5
        T = np.array([[s,0,-mean[0]*s],
                      [0,s,-mean[1]*s],
                      [0,0,1]])
    # for 3D points
    if pts.shape[0] == 3:
        s = (3/np.sum(np.var(pts,axis=1)))**0.5
        T = np.array([[s, 0, 0, -mean[0]*s],
                      [0, s, 0, -mean[1]*s],
                      [0, 0, s, -mean[2]*s],
                      [0, 0, 0, 1]])

    pts = T @ homo_pts
    
    return pts, T

def sampson_error(F,x1,x2):
    # calculate sampson error of x1 and x2
    # x1, x2 are inhomo points
    error = np.zeros((x1.shape[1]))
    delta = np.zeros((4,x1.shape[1]))
    x1 = Homogenize(x1)
    x2 = Homogenize(x2)
    for i in range(x1.shape[1]):
        epsilon = x2[:,i].reshape(1,-1) @ F @ x1[:,i].reshape(-1,1)
        
        J = np.array([x2[0,i] * F[0,0] + x2[1,i] * F[1,0] + F[2,0],
                      x2[0,i] * F[0,1] + x2[1,i] * F[1,1] + F[2,1],
                      x1[0,i] * F[0,0] + x1[1,i] * F[0,1] + F[0,2],
                      x1[0,i] * F[1,0] + x1[1,i] * F[1,1] + F[1,2]])
        J = J.reshape(1,-1)
        lam = np.linalg.solve((J @ J.T),(-epsilon))
        lam = lam.reshape(-1,1)
        delta_x = (J.T @ lam).reshape(-1,1)
        error[i] = delta_x.T @ delta_x
        delta[:,i] = delta_x.reshape(-1)
        
    return error ,delta


def MSAC(pts1, pts2, thresh, tol, p):
    # Inputs:
    #    pts1 - matched feature correspondences in image 1
    #    pts2 - matched feature correspondences in image 2
    #    thresh - cost threshold
    #    tol - reprojection error tolerance 
    #    p - probability that as least one of the random samples does not contain any outliers   
    #
    # Output:
    #    consensus_min_cost - final cost from MSAC
    #    consensus_min_cost_model - fundamental matrix F
    #    inliers - list of indices of the inliers corresponding to input data
    #    trials - number of attempts taken to find consensus set
    
    """your code here"""
    x1_homo = Homogenize(pts1)
    x2_homo = Homogenize(pts2)
    x1_inhomo = pts1
    x2_inhomo = pts2
    p = 0.99
    s = 7
    trials = 0
    max_trials = np.inf
    consensus_min_cost = np.inf
    consensus_min_cost_model = np.zeros((3,3))
    random.seed(66)
    random_seq = np.zeros((100,7)).astype(int)
    for i in range(100):
        random_seq[i,:] = random.sample(range(0, x1_homo.shape[1]), 7)
    while (trials < max_trials and consensus_min_cost > thresh):
        trials = trials + 1
        # random samples
        num = random_seq[trials,:]
        x1_random = x1_homo[:,num]
        x2_random = x2_homo[:,num]
        x1_norm, T1 = Normalize(Dehomogenize(x1_random))
        x2_norm, T2 = Normalize(Dehomogenize(x2_random))

        # calculate model
        A = np.zeros((s,9))
        for i in range(s):
            A[i,:] = np.kron(x2_norm[:,i],x1_norm[:,i])
        _,_,vh = np.linalg.svd(A)
        
        a = vh[-1,:]
        b = vh[-2,:]
        F1 = Matrix([[a[0],a[1],a[2]],
                     [a[3],a[4],a[5]],
                     [a[6],a[7],a[8]]])
        F2 = Matrix([[b[0],b[1],b[2]],
                     [b[3],b[4],b[5]],
                     [b[6],b[7],b[8]]])
        alpha = Symbol('alpha')
        sol = solve((alpha*F1+F2).det(), alpha)
        sol = np.array(sol).astype(np.complex64)
        sol = sol[np.abs(sol.imag)<1e-18].real
        F1 = np.array(F1).astype(np.float64)
        F2 = np.array(F2).astype(np.float64)
        for i in range(sol.shape[0]):
            F_temp = sol[i]*F1 + F2
            F = T2.T @ F_temp @ T1
            # calculate error
            error,_ = sampson_error(F,x1_inhomo,x2_inhomo)
            # calculate cost
            cost = 0
            for j in range(x1_homo.shape[1]):
                if error[j] <= tol:
                    cost = cost + error[j]
                else:
                    cost = cost + tol
                    
            if cost < consensus_min_cost:            
                consensus_min_cost = cost
                consensus_min_cost_model = F
                n_inliers = (error <= tol)
                w = np.sum(n_inliers)/x1_homo.shape[1]
                max_trials = np.log(1-p) / np.log(1-w**s)
                inliers = np.where(n_inliers == True)[0]
        
        
    

    return consensus_min_cost, consensus_min_cost_model, inliers, trials


# MSAC parameters 
thresh = 300
tol = chi2.ppf(0.95,1)
p = 0.99
alpha = 0.95

tic=time.time()

cost_MSAC, F_MSAC, inliers, trials = MSAC(match1, match2, thresh, tol, p)

# choose just the inliers
x1 = match1[:,inliers]
x2 = match2[:,inliers]
outliers = np.setdiff1d(np.arange(match1.shape[1]),inliers)

toc=time.time()
time_total=toc-tic

# display the results
print('took %f secs'%time_total)
print('%d iterations'%trials)
print('inlier count: ',len(inliers))
print('inliers: ',inliers)
print('MSAC Cost = %.9f'%cost_MSAC)
print('F_MSAC = ')
print(F_MSAC)

# display the figures
"""your code here"""
plt.figure(figsize=(14,8))
ax1 = plt.subplot(1,2,1)
ax2 = plt.subplot(1,2,2)
ax1.imshow(I1)
ax2.imshow(I2)
plt.title('found %d inliers'%len(inliers))
for i in range(x1.shape[1]):
    x_1,y_1 = x1[:,i]
    x_2,y_2 = x2[:,i]
    ax1.plot([x_1, x_2],[y_1, y_2],'-r')
    ax1.add_patch(patches.Rectangle((x_1-w/2,y_1-w/2),w,w, fill=False))
    ax2.plot([x_2, x_1],[y_2, y_1],'-r')
    ax2.add_patch(patches.Rectangle((x_2-w/2,y_2-w/2),w,w, fill=False))


plt.show()

took 18.347992 secs
44 iterations
inlier count:  217
inliers:  [  0   1   2   3   4   6   8   9  10  12  13  14  15  16  18  20  21  23
  26  27  29  30  33  34  36  37  38  40  41  43  44  45  46  47  48  49
  50  51  52  54  55  56  59  60  61  63  66  67  68  70  71  74  76  77
  78  79  80  82  84  85  88  89  90  92  93  94  95  96  97  99 102 103
 105 106 108 109 110 111 112 113 114 115 116 118 119 120 121 123 125 128
 129 131 132 133 136 137 138 139 141 142 143 144 145 147 149 150 152 153
 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 171 173
 174 177 178 179 181 182 183 185 186 187 189 193 195 196 198 199 200 203
 204 205 206 207 208 209 210 211 212 214 215 216 217 218 219 221 223 224
 226 227 228 229 230 231 232 233 234 236 237 239 240 241 242 245 246 248
 251 252 253 254 255 256 257 258 260 263 265 266 268 269 270 272 273 274
 275 277 278 280 281 282 283 284 285 286 287 288 289 291 292 294 296 297
 298]
MSAC Cost = 460.360347495
F_MSAC = 
[[ 8.22155197e-09  8.50600330e-07 -3.76180844e-04]
 [-4.46786597e-07  2.97098426e-07 -4.00160051e-03]
 [ 2.61797261e-04  3.55499347e-03  3.85334949e-01]]

Final values for parameters¶

random seed = 66
$p$ = 0.99
$\alpha$ = 0.95
tolerance = 3.8415
threshold = 300
num_inliers = 217
num_attempts = 44
consensus_min_cost = 460.36

Problem 5 (Programming): Linear Estimation of the Fundamental Matrix (15 points)¶

Estimate the fundamental matrix $\boldsymbol{F}_\text{DLT}$ from the resulting set of inlier correspondences using the direct linear transformation (DLT) algorithm (with data normalization). Include the numerical values of the resulting $\boldsymbol{F}_\text{DLT}$, scaled such that $||\boldsymbol{F}_\text{DLT}||_\text{Fro} = 1$

def Homogenize(x):
    # converts points from inhomogeneous to homogeneous coordinates    
    return np.vstack((x,np.ones((1,x.shape[1]))))

def Dehomogenize(x):
    # converts points from homogeneous to inhomogeneous coordinates   
    return x[:-1]/x[-1]

def DLT(x1, x2, normalize=True):
    # Inputs:
    #    x1 - inhomogeneous inlier correspondences in image 1
    #    x2 - inhomogeneous inlier correspondences in image 2
    #    normalize - if True, apply data normalization to x1 and x2
    #
    # Outputs:
    #    F - the DLT estimate of the fundamental matrix  
    
    """your code here"""
    
    # data normalization
    if normalize:
        x1, T1 = Normalize(x1)
        x2, T2 = Normalize(x2)
    else:
        x1 = Homogenize(x1)
        x2 = Homogenize(x2)
            
        
    n = x1.shape[1]
    # calculate F
    A = np.zeros((n,9))
    for i in range(n):
        A[i,:] = np.kron(x2[:,i],x1[:,i])

    _,_,vh = np.linalg.svd(A)
    f = vh[-1,:]
    F = f.reshape(3,3)
    # enforce singularity constraint
    u,s,vh = np.linalg.svd(F)
    s[-1] = 0
    F = u @ np.diag(s) @ vh
        
    # data denormalization
    if normalize:
        F = T2.T @ F @ T1
    
    F = F/np.linalg.norm(F,'fro')
    return F


# compute the linear estimate with data normalization
print ('DLT with Data Normalization')
time_start=time.time()
F_DLT = DLT(x1, x2, normalize=True)
time_total=time.time()-time_start

# display the resulting F_DLT, scaled with its frobenius norm
print('F_DLT =')
print(F_DLT)

DLT with Data Normalization
F_DLT =
[[ 2.31786548e-08  2.79558278e-06 -1.15045676e-03]
 [-1.72890403e-06  7.36279460e-07 -1.02439373e-02]
 [ 8.98770895e-04  9.04013306e-03  9.99905599e-01]]

Problem 6 (Programming): Nonlinear Estimation of the Fundamental Matrix (70 points)¶

Retrieve the camera projection matrices $\boldsymbol{P} = [\boldsymbol{I} \,|\, \boldsymbol{0}]$ and $\boldsymbol{P}' = [\boldsymbol{M} \,|\, \boldsymbol{v}]$, where $\boldsymbol{M}$ is full rank, from $\boldsymbol{F}_\text{DLT}$. Use the resulting camera projection matrix $\boldsymbol{P}'$ associated with the second image and the triangulated 3D points as an initial estimate to an iterative estimation method, specifically the sparse Levenberg-Marquardt algorithm, to determine the Maximum Likelihood estimate of the fundamental matrix $\boldsymbol{F} = [\boldsymbol{v}]_\times \boldsymbol{M}$ that minimizes the reprojection error. The initial estimate of the 3D points must be determined using the two-view optimal triangulation method described in lecture (algorithm 12.1 in the Hartley \& Zisserman book, but use the ray-plane intersection method for the final step instead of the homogeneous method). Additionally, you must parameterize the camera projection matrix $\boldsymbol{P}'$ associated with the second image and the homogeneous 3D scene points that are being adjusted using the parameterization of homogeneous vectors (see section A6.9.2 (page 624) of the textbook, and the corrections and errata).

Report the initial cost (i.e. cost at iteration 0) and the cost at the end of each successive iteration. Show the numerical values for the final estimate of the fundamental matrix $\boldsymbol{F}_\text{LM}$, scaled such that $||\boldsymbol{F}_\text{LM}||_\text{Fro} = 1$.

# Note that np.sinc is different than defined in class
def Sinc(x):
    # Returns a scalar valued sinc value
    """your code here"""
    
    if x == 0:
        y = 1
    else :
        y = np.sin(x)/x
    
    return y

def d_sinc(x):
    if x == 0:
        return 0
    else:
        return (np.cos(x)/(x)-np.sin(x)/(x**2))

def Parameterize(H):
    # wrapper function to interface with LM
    # takes all optimization variables and parameterizes all of them
    return ParameterizeHomog(H.reshape(-1,1))


def Deparameterize(p):
    # Deparameterize all optimization variables
    return DeParameterizeHomog(p).reshape(3,4)


def ParameterizeHomog(V):
    # Given a homogeneous vector V return its minimal parameterization
    """your code here"""
    V = V/np.linalg.norm(V)
    V = V.reshape(-1,1)
    a = V[0]
    b = V[1:]
    v_hat = (2/(Sinc(np.arccos(a))))*b
    if np.linalg.norm(v_hat) > np.pi:
        v_hat = (1 - 2*np.pi/np.linalg.norm(v_hat)*
                 np.ceil((np.linalg.norm(v_hat)-np.pi)/(2*np.pi)))*v_hat
    
    
    return v_hat


def DeParameterizeHomog(v):
    # Given a parameterized homogeneous vector return its deparameterization
    """your code here"""

    v_norm = np.linalg.norm(v)
    a = np.cos(v_norm/2).reshape(-1,1)
    b = (Sinc(v_norm/2)/2)*v
    b = b.reshape(-1,1)
    v_bar = np.vstack((a,b))
    
    return v_bar


def d_V_v(v):
    # v is parametrized vector
    v = v.reshape(-1,1)
    n = v.shape[0]
    if np.linalg.norm(v) == 0:
        dV_v = np.concatenate([np.zeros((1,n)),np.eye(n)*0.5],axis = 0)
    else:
        v_norm = np.linalg.norm(v)
        a = np.cos(v_norm/2).reshape(-1,1)
        b = (Sinc(v_norm/2)/2)*v
        b = b.reshape(-1,1)
        db_v = ((Sinc(v_norm/2)/2)*np.eye(n)
                + 1/(4*v_norm)*d_sinc(v_norm/2)*(v @ v.T))
        dV_v = np.concatenate([b.T*(-0.5),db_v],axis = 0)
    return dV_v

def skew(w):
    # Returns the skew-symmetrix represenation of a vector
    """your code here"""
    w_skew = np.array([[0, -w[2], w[1]],
                       [w[2],0,-w[0]],
                       [-w[1],w[0],0]])
    
    return w_skew

def F_to_P(F):
    _,_,vh_ = np.linalg.svd(F.T)
    e_null = -vh_[-1,:]
    Z = np.array([[0,-1,0],
                  [1,0,0],
                  [0,0,1]])
    W = np.array([[0,1,0],
                  [-1,0,0],
                  [0,0,0]])
    u,s,vh = np.linalg.svd(F)
    s[2] = (s[0]+s[1])/2
    M = u @ Z @ np.diag(s) @ vh
    P2 = np.hstack((M,e_null.reshape(-1,1)))
    return P2


    
def triangulation(F,x1_homo,x2_homo):
    # x1,x2 homo 2D points
    # F fundamental matrox
    X_scene = np.zeros((4,x2_homo.shape[1]))
    for i in range(x2_homo.shape[1]):
        T1 = np.array([[x1_homo[2,i],0,-x1_homo[0,i]],
                       [0,x1_homo[2,i],-x1_homo[1,i]],
                       [0,0,x1_homo[2,i]]])
        T2 = np.array([[x2_homo[2,i],0,-x2_homo[0,i]],
                       [0,x2_homo[2,i],-x2_homo[1,i]],
                       [0,0,x2_homo[2,i]]])
        Fs = np.linalg.inv(T2) @ F @ np.linalg.inv(T1)
        u,_,vh = np.linalg.svd(Fs)
        e1 = vh[-1,:]
        u,_,vh = np.linalg.svd(Fs.T)
        e2 = vh[-1,:]
        e1 = np.sqrt(1 / (e1[0]**2 + e1[1]**2)) * e1
        e2 = np.sqrt(1 / (e2[0]**2 + e2[1]**2)) * e2
        R1 = np.array([[e1[0], e1[1], 0],
                       [-e1[1], e1[0], 0],
                       [0,0,1]])
        R2 = np.array([[e2[0], e2[1], 0],
                       [-e2[1], e2[0], 0],
                       [0,0,1]])
        Fs = R2 @ Fs @ R1.T
        a = Fs[1,1]
        b = Fs[1,2]
        c = Fs[2,1]
        d = Fs[2,2]
        f1 = e1[2]
        f2 = e2[2]
        # find roots
        t = Symbol('t')
        eq = (t * ((a * t + b)**2 + f2**2 * (c * t + d)**2)**2 
              - (a * d - b * c) *(1 + f1**2 * t**2)**2 * (a * t + b) * (c * t + d))
        t_roots = solve(eq,t)
        t_roots = np.array(t_roots).astype(np.complex64).real
        # find t_min
        st_min = np.inf
        t_min = 0
        for j in range(t_roots.shape[0]):
            if t_roots[j] > 1e10:
                st = 1/(f1**2) + (c**2) / (a**2 + (f2**2) * (c**2)) 
            else:
                st = ((t_roots[j]**2) / (1+(f1**2)*(t_roots[j]**2)) 
                      + ((c*t_roots[j]+d)**2) / ((a*t_roots[j]+b)**2 
                                                 + f2**2 *(c*t_roots[j]+d)**2))
            if st < st_min:
                st_min = st
                t_min = t_roots[j]
        l1 = np.array([t_min*f1,1,-t_min])
        l2 = np.array([-f2*(c*t_min+d),a*t_min+b,c*t_min+d])
        xs1 = np.array([-l1[0]*l1[2],-l1[1]*l1[2],l1[0]**2+l1[1]**2]).reshape(-1,1)
        xs2 = np.array([-l2[0]*l2[2],-l2[1]*l2[2],l2[0]**2+l2[1]**2]).reshape(-1,1)
        x1_opt = np.linalg.inv(T1) @ R1.T @ xs1
        x2_opt = np.linalg.inv(T2) @ R2.T @ xs2
        l2 = F @ x1_opt
        l2_ort = np.array([-l2[1] * x2_opt[2], l2[0] * x2_opt[2], l2[1] * x2_opt[0] - l2[0] * x2_opt[1]]).reshape(-1,1)
        P2 = F_to_P(F)
        plane = P2.T @ l2_ort
        X_scene[:, i] = np.vstack((plane[3] * x1_opt,-plane[0:3].T @ x1_opt)).reshape(-1)
    
    return X_scene

def Jacobian(P2,X_scene):
    # X_scene 4 X n
    # P2 3X4
    p2_hat = Parameterize(P2).reshape(-1,1)
    P1 = np.hstack((np.eye(3),np.zeros((3,1))))
    x1_hat = Dehomogenize(P1 @ X_scene)
    x2_hat = Dehomogenize(P2 @ X_scene)
    X_hat = np.zeros((3,1))
    for i in range(X_scene.shape[1]):
        X_hat = np.hstack((X_hat,ParameterizeHomog(X_scene[:,i])))
    X_hat = X_hat[:,1:] # 3 X n

    A = np.zeros((1,11))
    B1 = np.zeros((1,3))
    B2 = np.zeros((1,3))
    for i in range(x1_hat.shape[1]):
        #  A
        xi_hat = x2_hat[:,i]
        Xi = X_scene[:,i].reshape(-1,1)
        w = P2[2,:].reshape(1,-1) @ Xi
        dx_hat_P = (1/w)*np.concatenate([np.concatenate([Xi.T,
                                                         np.zeros((1,4)),
                                                         -xi_hat[0]*Xi.T],axis = 1),
                                         np.concatenate([np.zeros((1,4)),
                                                         Xi.T,
                                                         -xi_hat[1]*Xi.T],axis = 1)],
                                        axis = 0)
        dP_p = d_V_v(p2_hat)

        Ai = np.dot(dx_hat_P, dP_p)
        A = np.vstack((A,Ai))

        # B1
        w = P1[2,:].reshape(1,-1) @ Xi
        dx1_x = np.vstack((P1[0,:] - x1_hat[0,i]*P1[2,:],
                           P1[1,:] - x1_hat[1,i]*P1[2,:]))
        dx1_x = dx1_x / w
        dX_x = d_V_v(X_hat[:,i])
        B1i = dx1_x @ dX_x
        B1 = np.vstack((B1,B1i))

        # B2
        w = P2[2,:].reshape(1,-1) @ Xi
        dx2_x = np.vstack((P2[0,:] - x2_hat[0,i]*P2[2,:],
                           P2[1,:] - x2_hat[1,i]*P2[2,:]))
        dx2_x = dx2_x / w
        dX_x = d_V_v(X_hat[:,i])
        B2i = dx2_x @ dX_x
        B2 = np.vstack((B2,B2i))

    A = A[1:,:]
    B1= B1[1:,:]
    B2= B2[1:,:]
    return A,B1,B2

from scipy.linalg import block_diag

def LM(F, x1, x2, max_iters, lam):
    # Input:
    #    F - DLT estimate of the fundamental matrix
    #    x1 - inhomogeneous inlier points in image 1
    #    x2 - inhomogeneous inlier points in image 2
    #    max_iters - maximum number of iterations
    #    lam - lambda parameter
    # Output:
    #    F - Final fundamental matrix obtained after convergence
    
   
    
    """your code here"""
    n = x1.shape[1]
    x1_inhomo = x1
    x1_homo = Homogenize(x1_inhomo)
    x2_inhomo = x2
    x2_homo = Homogenize(x2_inhomo)


    # calculate X by triangulation
    #X_scene = triangulation(F,x1_homo,x2_homo) 
    # it is very slow when using the sumpy solver, so I store the result for future use
    X_scene = np.load('X_scene66.npy')

    # F to P
    P1 = np.hstack((np.eye(3),np.zeros((3,1))))
    P2 = F_to_P(F)

    # parameterize
    X_hat = np.zeros((3,1))
    for i in range(X_scene.shape[1]):
        X_hat = np.hstack((X_hat,ParameterizeHomog(X_scene[:,i])))
    X_hat = X_hat[:,1:] # 3 X n
    p2_hat = Parameterize(P2).reshape(-1,1)

    # step 1
    # deparametrize
    X_scene = np.zeros((4,1))
    for i in range(X_hat.shape[1]):
        X_scene = np.hstack((X_scene,DeParameterizeHomog(X_hat[:,i])))
    X_scene = X_scene[:,1:] # 4 X n
    P2 = Deparameterize(p2_hat)

    x1_hat = P1 @ X_scene
    x2_hat = P2 @ X_scene
    x1_hat_inhomo = Dehomogenize(x1_hat)
    x2_hat_inhomo = Dehomogenize(x2_hat)
    epsilon1 = (x1_inhomo - x1_hat_inhomo)
    e1 = epsilon1.reshape(-1,1)
    epsilon2 = (x2_inhomo - x2_hat_inhomo)
    e2 = epsilon2.reshape(-1,1)
    cost = e1.T @ e1 + e2.T  @ e2
    iters = 0
    
    print ('iter %03d Cost %.9f'%(0, cost))
    
    for i in range(max_iters): 
        # step 2
        A,B1,B2 = Jacobian(P2,X_scene)
        # step 3
        # normal equation matrix
        U = A.T @ A
        V = np.zeros((3,3,n))
        for j in range(n):
            V[:,:,j] = ((B1[2*j:2*j+2,:].T @ B1[2*j:2*j+2,:])
                        +(B2[2*j:2*j+2,:].T @ B2[2*j:2*j+2,:]))
        W = np.zeros((11,3,n))
        for j in range(n):
            W[:,:,j] = (A[2*j:2*j+2,:].T @ B2[2*j:2*j+2,:])

        # normal equation vector
        epsilon_a = np.zeros((11,1))
        epsilon_b = np.zeros((3,n))
        for j in range(n):
            epsilon_a = epsilon_a + (A[2*j:2*j+2,:].T 
                                     @ epsilon2[:,j]).reshape(-1,1)

            epsilon_b[:,j] = ((B1[2*j:2*j+2,:].T @ epsilon1[:,j])
                             + (B2[2*j:2*j+2,:].T @ epsilon2[:,j]))
        flag = 1
        while flag:
            #step 4
            delta_b = np.zeros((3,n))
            S = U + lam*np.eye(11)
            e = epsilon_a
            for j in range(n):
                S = S - (W[:,:,j] 
                         @ np.linalg.inv(V[:,:,j]+lam*np.eye(3)) 
                         @ W[:,:,j].T)
                e = e - (W[:,:,j] 
                         @ np.linalg.inv(V[:,:,j]+lam*np.eye(3)) 
                         @ epsilon_b[:,j]).reshape(-1,1)
            delta_a = np.linalg.solve(S,e)
            for j in range(n):
                delta_b[:,j] = (np.linalg.inv(V[:,:,j]+lam*np.eye(3)) 
                                @ (epsilon_b[:,j] - (W[:,:,j].T @ delta_a).reshape(-1)))

            #step 5
            p2_hat0 = p2_hat + delta_a
            X_hat0 = X_hat + delta_b

            #step 6
            # deparametrize
            X_scene0 = np.zeros((4,1))
            for j in range(X_hat.shape[1]):
                X_scene0 = np.hstack((X_scene0,DeParameterizeHomog(X_hat0[:,j])))
            X_scene0 = X_scene0[:,1:] # 4 X n
            P2_0 = Deparameterize(p2_hat0)
            # cost

            x1_hat0 = P1 @ X_scene0
            x2_hat0 = P2_0 @ X_scene0
            x1_hat_inhomo0 = Dehomogenize(x1_hat0)
            x2_hat_inhomo0 = Dehomogenize(x2_hat0)
            epsilon10 = (x1_inhomo - x1_hat_inhomo0)
            e10 = epsilon10.reshape(-1,1)
            epsilon20 = (x2_inhomo - x2_hat_inhomo0)
            e20 = epsilon20.reshape(-1,1)
            cost0 = e10.T @ e10 + e20.T  @ e20
            cost = e1.T @ e1 + e2.T  @ e2

            #step 7
            if cost0 < cost:
                p2_hat = p2_hat0
                X_hat = X_hat0
                X_scene = X_scene0
                P2 = P2_0
                e1 = e10
                e2 = e20
                epsilon1 = epsilon10
                epsilon2 = epsilon20
                lam = 0.1 * lam
                iters = iters + 1
                flag = 0
            else :
                lam = 10 * lam
            
        print ('iter %03d Cost %.9f'%(i+1, cost0))
        # terminate condition
        if 1e-7 > (1 - (cost0 / cost)):
            break
    
    # P to F
    M = P2[:,:3]
    e_null = P2[:,-1]
    F = skew(e_null) @ M
    F = F/np.linalg.norm(F,'fro')
    
    return F

# LM hyperparameters
lam = .001
max_iters = 20

# Run LM initialized by DLT estimate
print ('Sparse LM')
time_start=time.time()
F_LM = LM(F_DLT, x1, x2, max_iters, lam)
time_total=time.time()-time_start
print('took %f secs'%time_total)

# display the resulting F_LM, scaled with its frobenius norm
print('F_LM =')
print(F_LM)

Sparse LM
iter 000 Cost 166.411541204
iter 001 Cost 74.968951823
iter 002 Cost 71.891919726
iter 003 Cost 71.862857055
iter 004 Cost 71.862856557
took 0.288722 secs
F_LM =
[[ 1.85141856e-08  2.89140860e-06 -1.12560452e-03]
 [-1.81539441e-06  7.36614861e-07 -1.03166359e-02]
 [ 8.83621477e-04  9.11344208e-03  9.99904228e-01]]

Sparse LM iter 000 Cost 166.411541204 iter 001 Cost 74.968951823 iter 002 Cost 71.891919726 iter 003 Cost 71.862857055 iter 004 Cost 71.862856557 took 0.310564 secs F_LM = [[ 1.85141856e-08 2.89140860e-06 -1.12560452e-03] [-1.81539441e-06 7.36614861e-07 -1.03166359e-02] [ 8.83621477e-04 9.11344208e-03 9.99904228e-01]]

Problem 7 (Programming): Point to Line Mapping (10 points)¶

Qualitatively determine the accuracy of $\boldsymbol{F}_\text{LM}$ by mapping points in image 1 to epipolar lines in image 2. Choose three distinct points $\boldsymbol{x}_{\{1,2,3\}}$ distributed in image 1 that are not in the set of inlier correspondences and map them to epipolar lines $\boldsymbol{l'}_{\{1,2,3\}} = \boldsymbol{F}_\text{LM} \boldsymbol{x}_{\{1,2,3\}}$ in the second image under the fundamental matrix $\boldsymbol{F}_\text{LM}$.

Include a figure containing the pair of images, where the three points in image 1 are indicated by a square (or circle) about the feature and the corresponding epipolar lines are drawn in image 2. Comment on the qualitative accuracy of the mapping. (Hint: each line $\boldsymbol{l'}_i$ should pass through the point $\boldsymbol{x'}_i$ in image 2 that corresponds to the point $\boldsymbol{x}_i$ in image 1).

"""your code here"""
# pick three outliers
outliers = np.setdiff1d(np.arange(match1.shape[1]),inliers)
x1_outliers = match1[:,outliers]
x2_outliers = match2[:,outliers]
num = random.sample(range(0, x1_outliers.shape[1]), 3)
x1_outliers = x1_outliers[:,num]
x2_outliers = x2_outliers[:,num]
x1_homo = Homogenize(x1_outliers)
x2_homo = Homogenize(x2_outliers)

# plot epipolar lines
plt.figure(figsize=(15,15))
ax1 = plt.subplot(1,2,1)
ax2 = plt.subplot(1,2,2)
ax1.imshow(I1)
ax2.imshow(I2)
ax1.set_title("Image 1")
ax2.set_title("Image 2")
w = 15
for i in range(x1_outliers.shape[1]):
    l = F_LM @ x1_homo[:,i]
    x_1,y_1 = x1_outliers[:,i]
    x_2,y_2 = x2_outliers[:,i]
    ax1.add_patch(patches.Rectangle((x_1-w/2,y_1-w/2),w,w, fill=False, color = 'r',linewidth = 2))
    x = I1.shape[1]
    ax2.plot([0, x], [(-l[2]/l[1]),((-l[2]/l[1])- (l[0]*x/l[1]))],color = 'r',linewidth=2)
    ax2.axis([0,I1.shape[1],I1.shape[0],0])


plt.show()

Comment:

As we can see, each epipolar line $\boldsymbol{l'}_i$ passes through the point $\boldsymbol{x'}_i$ in image 2 that corresponds to the point $\boldsymbol{x}_i$ in image 1. Thus, the fundamental matrix $\boldsymbol{F}_\text{LM}$ is accurate.

Problem 8 (Programming): Projective to Euclidean Reconstruction (15 points)¶

You are given a Matlab file containing points obtained from applying three-view geometry techniques (using the trifocal tensor) to obtain a projective reconstruction of points from a 3D scene. Also in the file are groundtruth control points. Compute the homography transformation using the DLT along with the projected 3D scene points and control points to upgrade the projective reconstruction to a Euclidean reconstruction. Render the scene, and comment on your results. What does the scene look like? (You may have to rotate the plot to get a better view.)

from mpl_toolkits.mplot3d import Axes3D
import scipy.io as sio

reconstruction = sio.loadmat('ereconstruction.mat')
X_projective = reconstruction['X_projective']
X_projective = X_projective.T
X_control = reconstruction['X_c']
X_control = X_control.T

def ComputeHomography(Xp, Xc):
    """your code here"""
    n = Xp.shape[1]
    # data normalization
    x1 = Dehomogenize(Xp)
    x2 = Dehomogenize(Xc)
    x1, T1 = Normalize(x1)
    x2, T2 = Normalize(x2)

    A = np.zeros((1,16))
    for i in range(n):
        xi_1 = x1[:,i].reshape(-1,1)
        xi_2 = x2[:,i].reshape(-1,1)
        # left nullspace of x2
        v = (xi_2 + np.sign(xi_2[0])
             *np.linalg.norm(xi_2)
             *np.array([1,0,0,0]).reshape(-1,1))
        Hv = np.eye(4) - 2*((v @ v.T)/(v.T @ v))
        x_ln = Hv[1:,:]
        A = np.vstack((A, np.kron(x_ln,xi_1.T)))
        
    A = A[1:,:]
    u, s, vh = np.linalg.svd(A)
    H = vh[-1,:].reshape(4,4)
 

    # data denormalize
    H = np.linalg.inv(T2) @ H @ T1    
    H = H/np.linalg.norm(H,'fro')

    print(H)
    return H

Xp = X_projective[:,:6]
Xc = X_control
H = ComputeHomography(Xp, Xc)
X_euclidean = H @ X_projective
X_euclidean = Dehomogenize(X_euclidean)
Xe, Ye, Ze = X_euclidean[0,:], X_euclidean[1,:], X_euclidean[2,:] 
fig = plt.figure(figsize=(15, 20))
axis = fig.add_subplot(2, 1, 1, projection="3d")
axis.scatter(Xe, Ye, Ze, marker="+", s=5)
axis = fig.add_subplot(2, 1, 2, projection="3d")
axis.scatter(Ye, Xe, Ze, marker="+", s=5)
plt.show()

[[-4.05391050e-02  2.30511092e-03 -1.27864732e-03  5.31813688e-01]
 [-2.83214207e-03 -3.52507579e-02 -2.02718886e-03  7.56337034e-01]
 [-8.35206850e-05  9.21532140e-05 -3.86033901e-02  1.99655946e-02]
 [ 8.66741162e-06 -6.87943722e-06  2.88078962e-05  3.74612486e-01]]

Comment:

The resulting point cloud looks like a town house.