The line in 3D defined by the join of the points $\boldsymbol{X}_1 = (X_1, Y_1, Z_1, T_1)^\top$ and $\boldsymbol{X}_2 = (X_2, Y_2, Z_2, T_2)^\top$ can be represented as a Plucker matrix $\boldsymbol{L} = \boldsymbol{X}_1 \boldsymbol{X}_2^\top - \boldsymbol{X}_2 \boldsymbol{X}_1^\top$ or pencil of points $\boldsymbol{X}(\lambda) = \lambda \boldsymbol{X}_1 + (1 - \lambda) \boldsymbol{X}_2$ (i.e., $\boldsymbol{X}$ is a function of $\lambda$). The line intersects the plane $\boldsymbol{\pi} = (a, b, c, d)^\top$ at the point $\boldsymbol{X}_{\boldsymbol{L}} = \boldsymbol{L} \boldsymbol{\pi}$ or $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$, where $\lambda_{\boldsymbol{\pi}}$ is determined such that $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})^\top \boldsymbol{\pi} = 0$ (i.e., $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$ is the point on $\boldsymbol{\pi}$). Show that $\boldsymbol{X}_{\boldsymbol{L}}$ is equal to $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$ up to scale.
Answer:
We know $\boldsymbol{X}_{\boldsymbol{L}}$ is a point in plane $\boldsymbol{\pi} = (a, b, c, d)^\top$, so $\boldsymbol{X}_{\boldsymbol{L}}^\top\boldsymbol{\pi}=0$.
$$\boldsymbol{X}_{\boldsymbol{L}} = \boldsymbol{L} \boldsymbol{\pi}$$$$\boldsymbol{X}_{\boldsymbol{L}}^\top = \boldsymbol{\pi}^\top \boldsymbol{L}^\top =\boldsymbol{\pi}^\top (\boldsymbol{X}_1\boldsymbol{X}_2^\top - \boldsymbol{X}_2 \boldsymbol{X}_1^\top)^\top =\boldsymbol{\pi}^\top (\boldsymbol{X}_2\boldsymbol{X}_1^\top - \boldsymbol{X}_1 \boldsymbol{X}_2^\top) =\boldsymbol{\pi}^\top \boldsymbol{X}_2\boldsymbol{X}_1^\top - \boldsymbol{\pi}^\top \boldsymbol{X}_1 \boldsymbol{X}_2^\top$$Therefore, $$\boldsymbol{X}_{\boldsymbol{L}}^\top\boldsymbol{\pi}=(\boldsymbol{\pi}^\top \boldsymbol{X}_2\boldsymbol{X}_1^\top - \boldsymbol{\pi}^\top \boldsymbol{X}_1 \boldsymbol{X}_2^\top)\boldsymbol{\pi} =((\boldsymbol{\pi}^\top \boldsymbol{X}_2)\boldsymbol{X}_1^\top - (\boldsymbol{\pi}^\top \boldsymbol{X}_1) \boldsymbol{X}_2^\top)\boldsymbol{\pi} = 0$$
We know $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$ is the point on $\boldsymbol{\pi}$, and $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})^\top \boldsymbol{\pi} = 0$.
$$\boldsymbol{X}(\lambda_{\boldsymbol{\pi}}) = \lambda_{\boldsymbol{\pi}} \boldsymbol{X}_1 + (1 - \lambda_{\boldsymbol{\pi}}) \boldsymbol{X}_2$$$$\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})^\top = \lambda_{\boldsymbol{\pi}} \boldsymbol{X}_1^\top + (1 - \lambda_{\boldsymbol{\pi}}) \boldsymbol{X}_2^\top$$Therefore, $$\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})^\top \boldsymbol{\pi} = (\lambda_{\boldsymbol{\pi}} \boldsymbol{X}_1^\top + (1 - \lambda_{\boldsymbol{\pi}}) \boldsymbol{X}_2^\top)\boldsymbol{\pi} = 0$$
Let $(\boldsymbol{\pi}^\top \boldsymbol{X}_2) = s_0\lambda_{\boldsymbol{\pi}}$ and $(-\boldsymbol{\pi}^\top \boldsymbol{X}_1) = s_1(1 - \lambda_{\boldsymbol{\pi}})$, $s_0$ and $s_1$ are some constants. Then,
$$s_0\lambda_{\boldsymbol{\pi}}= aX_2 + bY_2 + cZ_2 + dT_2$$$$\lambda_{\boldsymbol{\pi}} = \frac{aX_2 + bY_2 + cZ_2 + dT_2}{s_0}$$$$s_1(1 - \lambda_{\boldsymbol{\pi}}) = -aX_1 - bY_1 - cZ_1 - dT_1$$$$\lambda_{\boldsymbol{\pi}} = \frac{aX_1 + bY_1 + cZ_1 + dT_1 + s_1}{s_1}$$Let $$\frac{aX_1 + bY_1 + cZ_1 + dT_1 + s_1}{s_1} = \frac{aX_2 + bY_2 + cZ_2 + dT_2}{s_0}$$ solve for $s_0$ and $s_1$, $$ s_0 = s_1 = a(X_2-X_1) + b(Y_2-Y_1) + c(Z_2-Z_1) + d(T_2-T_1) = s$$ Finally, we have
$$\boldsymbol{X}_{\boldsymbol{L}} = \boldsymbol{L} \boldsymbol{\pi} = \boldsymbol{X}_1 \boldsymbol{X}_2^\top\boldsymbol{\pi} - \boldsymbol{X}_2 \boldsymbol{X}_1^\top\boldsymbol{\pi} = s\lambda_{\boldsymbol{\pi}}\boldsymbol{X}_1 + s(1 - \lambda_{\boldsymbol{\pi}})\boldsymbol{X}_2 = s(\lambda_{\boldsymbol{\pi}}\boldsymbol{X}_1 + (1 - \lambda_{\boldsymbol{\pi}})\boldsymbol{X}_2) = s\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$$Therefore, $\boldsymbol{X}_{\boldsymbol{L}}$ is equal to $\boldsymbol{X}(\lambda_{\boldsymbol{\pi}})$ up to scale $s$.
In general, a line in 3D intersects a quadric $\boldsymbol{Q}$ at zero, one (if the line is tangent to the quadric), or two points. If the pencil of points $\boldsymbol{X}(\lambda) = \lambda \boldsymbol{X}_1 + (1 - \lambda) \boldsymbol{X}_2$ represents a line in 3D, the (up to two) real roots of the quadratic polynomial $c_2 \lambda_{\boldsymbol{Q}}^2 + c_1 \lambda_{\boldsymbol{Q}} + c_0 = 0$ are used to solve for the intersection point(s) $\boldsymbol{X}(\lambda_{\boldsymbol{Q}})$. Show that $c_2 = \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_1 - 2 \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 + \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2$, $c_1 = 2 ( \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 - \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2 )$, and $c_0 = \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2$.
Answer:
The line $\boldsymbol{X}(\lambda)$ intersects a quadric $\boldsymbol{Q}$ at points $\boldsymbol{X}(\lambda_{\boldsymbol{Q}})$. So we have, $$\boldsymbol{X}(\lambda_{\boldsymbol{Q}})^\top\boldsymbol{Q}\boldsymbol{X}(\lambda_{\boldsymbol{Q}}) = 0$$
And $\boldsymbol{X}(\lambda_Q) = \lambda_Q \boldsymbol{X}_1 + (1 -\lambda_Q) \boldsymbol{X}_2$, So we get,
$$(\lambda_Q \boldsymbol{X}_1 + (1 -\lambda_Q) \boldsymbol{X}_2)^\top\boldsymbol{Q}(\lambda_Q \boldsymbol{X}_1 + (1 -\lambda_Q) \boldsymbol{X}_2) = 0$$$$\lambda_Q^2\boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_1 + (\lambda_Q - \lambda_Q^2)\boldsymbol{X}_1^\top \boldsymbol{Q}\boldsymbol{X}_2 + (\lambda_Q - \lambda_Q^2)\boldsymbol{X}_2^\top \boldsymbol{Q}\boldsymbol{X}_1 + (1-2\lambda_Q + \lambda_Q^2)\boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2 = 0$$$$\lambda_Q^2(\boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_1 - 2 \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 + \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2) + 2\lambda_Q(\boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 - \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2) + \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2= 0$$$$c_2 \lambda_{\boldsymbol{Q}}^2 + c_1\lambda_{\boldsymbol{Q}} + c_0 = 0$$Therefore, we get $c_2 = \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_1 - 2 \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 + \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2$, $c_1 = 2 ( \boldsymbol{X}_1^\top \boldsymbol{Q} \boldsymbol{X}_2 - \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2 )$, and $c_0 = \boldsymbol{X}_2^\top \boldsymbol{Q} \boldsymbol{X}_2$.
Download input data from the course website. The file hw2_points3D.txt contains the coordinates of 50 scene points in 3D (each line of the file gives the $\tilde{X}_i$, $\tilde{Y}_i$, and $\tilde{Z}_i$ inhomogeneous coordinates of a point). The file hw2_points2D.txt contains the coordinates of the 50 corresponding image points in 2D (each line of the file gives the $\tilde{x}_i$ and $\tilde{y}_i$ inhomogeneous coordinates of a point). The scene points have been randomly generated and projected to image points under a camera projection matrix (i.e., $\boldsymbol{x}_i = \boldsymbol{P} \boldsymbol{X}_i$), then noise has been added to the image point coordinates.
Estimate the camera projection matrix $\boldsymbol{P}_\text{DLT}$ using the direct linear transformation (DLT) algorithm (with data normalization). You must express $\boldsymbol{x}_i = \boldsymbol{P} \boldsymbol{X}_i$ as $[\boldsymbol{x}_i]^\perp \boldsymbol{P} \boldsymbol{X}_i = \boldsymbol{0}$ (not $\boldsymbol{x}_i \times \boldsymbol{P} \boldsymbol{X}_i = \boldsymbol{0}$), where $[\boldsymbol{x}_i]^\perp \boldsymbol{x}_i = \boldsymbol{0}$, when forming the solution. Return $\boldsymbol{P}_\text{DLT}$, scaled such that $||\boldsymbol{P}_\text{DLT}||_\text{Fro} = 1$
The following helper functions may be useful in your DLT function implementation. You are welcome to add any additional helper functions.
import numpy as np
import time
def Homogenize(x):
# converts points from inhomogeneous to homogeneous coordinates
return np.vstack((x,np.ones((1,x.shape[1]))))
def Dehomogenize(x):
# converts points from homogeneous to inhomogeneous coordinates
return x[:-1]/x[-1]
def Normalize(pts):
# data normalization of n dimensional pts
#
# Input:
# pts - is in inhomogeneous coordinates
# Outputs:
# pts - data normalized points
# T - corresponding transformation matrix
"""your code here"""
homo_pts = Homogenize(pts)
mean = np.mean(pts,axis=1)
# for 2D points
if pts.shape[0] == 2:
s = (2/np.sum(np.var(pts,axis=1)))**0.5
T = np.array([[s,0,-mean[0]*s],
[0,s,-mean[1]*s],
[0,0,1]])
# for 3D points
if pts.shape[0] == 3:
s = (3/np.sum(np.var(pts,axis=1)))**0.5
T = np.array([[s, 0, 0, -mean[0]*s],
[0, s, 0, -mean[1]*s],
[0, 0, s, -mean[2]*s],
[0, 0, 0, 1]])
pts = T @ homo_pts
return pts, T
def ComputeCost(P, x, X):
# Inputs:
# x - 2D inhomogeneous image points
# X - 3D inhomogeneous scene points
#
# Output:
# cost - Total reprojection error
n = x.shape[1]
covarx = np.eye(2*n)
"""your code here"""
homo_X = Homogenize(X)
proj_x = P @ homo_X
dehomo_x = Dehomogenize(proj_x)
#cost = np.sum((x - dehomo_x)**2)
e = (x - dehomo_x).reshape(-1,1)
cost = e.T @ covarx @ e
return cost
def DLT(x, X, normalize=True):
# Inputs:
# x - 2D inhomogeneous image points
# X - 3D inhomogeneous scene points
# normalize - if True, apply data normalization to x and X
#
# Output:
# P - the (3x4) DLT estimate of the camera projection matrix
P = np.eye(3,4)+np.random.randn(3,4)/10
# data normalization
if normalize:
x, T = Normalize(x)
X, U = Normalize(X)
else:
x = Homogenize(x)
X = Homogenize(X)
"""your code here"""
A = np.zeros((1,12))
for i in range(x.shape[1]):
xi = x[:,i].reshape(3,1)
Xi = X[:,i].reshape(4,1)
v = xi + np.sign(xi[0])*np.linalg.norm(xi)*np.array([1,0,0]).reshape(3,1)
Hv = np.eye(3) - 2*((v @ v.T)/(v.T @ v))
x_ln = Hv[1:,:]
A = np.vstack((A, np.kron(x_ln,Xi.T)))
A = A[1:,:]
u, s, vh = np.linalg.svd(A)
P = vh[-1,:].reshape(3,4)
# data denormalize
if normalize:
P = np.linalg.inv(T) @ P @ U
return (P/np.linalg.norm(P)*np.sign(P[-1,-1]))
def displayResults(P, x, X, title):
print(title+' =')
print (P/np.linalg.norm(P)*np.sign(P[-1,-1]))
# load the data
x=np.loadtxt('hw2_points2D.txt').T
X=np.loadtxt('hw2_points3D.txt').T
# compute the linear estimate without data normalization
print ('Running DLT without data normalization')
time_start=time.time()
P_DLT = DLT(x, X, normalize=False)
cost = ComputeCost(P_DLT, x, X)
time_total=time.time()-time_start
# display the results
print('took %f secs'%time_total)
print('Cost=%.9f'%cost)
# compute the linear estimate with data normalization
print ('Running DLT with data normalization')
time_start=time.time()
P_DLT = DLT(x, X, normalize=True)
cost = ComputeCost(P_DLT, x, X)
time_total=time.time()-time_start
# display the results
print('took %f secs'%time_total)
print('Cost=%.9f'%cost)
# Report your P_DLT value here!
displayResults(P_DLT, x, X, 'P_DLT')
Use $\boldsymbol{P}_\text{DLT}$ as an initial estimate to an iterative estimation method, specifically the Levenberg-Marquardt algorithm, to determine the Maximum Likelihood estimate of the camera projection matrix that minimizes the projection error. You must parameterize the camera projection matrix as a parameterization of the homogeneous vector $\boldsymbol{p} = vec{(\boldsymbol{P}^\top)}$. It is highly recommended to implement a parameterization of homogeneous vector method where the homogeneous vector is of arbitrary length, as this will be used in following assignments.
Report the initial cost (i.e. cost at iteration 0) and the cost at the end of each successive iteration. Show the numerical values for the final estimate of the camera projection matrix $\boldsymbol{P}_\text{LM}$, scaled such that $||\boldsymbol{P}_\text{LM}||_\text{Fro} = 1$.
The following helper functions may be useful in your LM function implementation. You are welcome to add any additional helper functions.
Hint: LM has its biggest cost reduction after the 1st iteration. You'll know if you are implementing LM correctly if you experience this.
# Note that np.sinc is different than defined in class
def Sinc(x):
# Returns a scalar valued sinc value
"""your code here"""
if x == 0:
y = 1
else :
y = np.sin(x)/x
return y
def d_sinc(x):
if x == 0:
return 0
else:
return (np.cos(x)/(x)-np.sin(x)/(x**2))
def d_V_v(v):
# v is parametrized vector
v = v.reshape(-1,1)
n = v.shape[0]
if np.linalg.norm(v) == 0:
dV_v = np.concatenate([np.zeros((1,n)),np.eye(n)*0.5],axis = 0)
else:
v_norm = np.linalg.norm(v)
a = np.cos(v_norm/2).reshape(-1,1)
b = (Sinc(v_norm/2)/2)*v
b = b.reshape(-1,1)
db_v = (Sinc(v_norm/2)/2)*np.eye(n)+ 1/(4*v_norm)*d_sinc(v_norm/2)*(v @ v.T)
dV_v = np.concatenate([b.T*(-0.5),db_v],axis = 0)
return dV_v
def Jacobian(P,p,X):
# compute the jacobian matrix
#
# Input:
# P - 3x4 projection matrix
# p - 11x1 homogeneous parameterization of P
# X - 3n 3D scene points
# Output:
# J - 2nx11 jacobian matrix
#J = np.zeros((2*X.shape[1],11))
"""your code here"""
A = np.zeros((1,11))
x_hat = P @ X
x_hat = Dehomogenize(x_hat)
for i in range(x.shape[1]):
xi_hat = x_hat[:,i]
Xi = X[:,i].reshape(-1,1)
w = P[2,:].reshape(1,-1) @ Xi
dx_hat_P = (1/w)*np.concatenate([np.concatenate([Xi.T,
np.zeros((1,4)),
-xi_hat[0]*Xi.T],axis = 1),
np.concatenate([np.zeros((1,4)),
Xi.T,
-xi_hat[1]*Xi.T],axis = 1)],
axis = 0)
dP_p = d_V_v(p)
Ai = np.dot(dx_hat_P, dP_p)
A = np.vstack((A,Ai))
#print(np.linalg.norm(p))
#print(dP_p)
J = A[1:,:]
return J
def Parameterize(P):
# wrapper function to interface with LM
# takes all optimization variables and parameterizes all of them
# in this case it is just P, but in future assignments it will
# be more useful
return ParameterizeHomog(P.reshape(-1,1))
def Deparameterize(p):
# Deparameterize all optimization variables
return DeParameterizeHomog(p).reshape(3,4)
def ParameterizeHomog(V):
# Given a homogeneous vector V return its minimal parameterization
"""your code here"""
V = V/np.linalg.norm(V)
V = V.reshape(-1,1)
a = V[0]
b = V[1:]
v_hat = (2/(Sinc(np.arccos(a))))*b
if np.linalg.norm(v_hat) > np.pi:
v_hat = (1 - 2*np.pi/np.linalg.norm(v_hat)*
np.ceil((np.linalg.norm(v_hat)-np.pi)/(2*np.pi)))*v_hat
return v_hat
def DeParameterizeHomog(v):
# Given a parameterized homogeneous vector return its deparameterization
"""your code here"""
v_norm = np.linalg.norm(v)
a = np.cos(v_norm/2).reshape(-1,1)
b = (Sinc(v_norm/2)/2)*v
b = b.reshape(-1,1)
v_bar = np.vstack((a,b))
return v_bar
def LM(P, x, X, max_iters, lam):
# Input:
# P - initial estimate of P
# x - 2D inhomogeneous image points
# X - 3D inhomogeneous scene points
# max_iters - maximum number of iterations
# lam - lambda parameter
# Output:
# P - Final P (3x4) obtained after convergence
# data normalization
x, T = Normalize(x)
X, U = Normalize(X)
"""your code here"""
P = T @ P @ np.linalg.inv(U)
sigma = np.eye(2*x.shape[1])*(T[0,0]**2) # 2n*2n
# step 1
err = Dehomogenize(x) - Dehomogenize(P @ X)
err = err.reshape(-1,1,order = 'F')
# you may modify this so long as the cost is computed
# at each iteration
iters = 0
for i in range(max_iters):
# step 2
P = P/np.linalg.norm(P)
p = Parameterize(P)
J = Jacobian(P,p,X)
# step 3
norm_U = J.T @ np.linalg.inv(sigma) @ J
E = J.T @ np.linalg.inv(sigma) @ err
# step 4
flag = 1
while flag:
delta = np.linalg.solve((norm_U + lam*np.eye(11)),E)
# step 5
p0 = p + delta
# step 6
P0 = Deparameterize(p0)
x0 = P0 @ X
err0 = Dehomogenize(x) - Dehomogenize(x0)
err0 = err0.reshape(-1,1,order = 'F')
# step 7
cost = err.T @ np.linalg.inv(sigma) @ err
cost0 = err0.T @ np.linalg.inv(sigma) @ err0
if cost0 < cost:
P = P0
err = err0
lam = 0.1 * lam
iters = iters + 1
flag = 0
else :
lam = 10 * lam
print ('iter %03d Cost %.9f'%(iters, cost0))
# terminate condition
if 1e-7 > (1 - (cost0 / cost)):
break
# data denormalization
P = np.linalg.inv(T) @ P @ U
return P
# LM hyperparameters
lam = .001
max_iters = 100
# Run LM initialized by DLT estimate with data normalization
print ('Running LM with data normalization')
print ('iter %03d Cost %.9f'%(0, cost))
time_start=time.time()
P_LM = LM(P_DLT, x, X, max_iters, lam)
time_total=time.time()-time_start
print('took %f secs'%time_total)
# Report your P_LM final value here!
displayResults(P_LM, x, X, 'P_LM')