
In computer vision, the transformation from 3D world coordinates to pixel coordinates is often represented by a 3x4 matrix P (3 rows by 4 columns), as detailed below. Given a camera in Blender, I need code that returns P.

Context where this shows up

It is common for people to want the reverse: to set Blender or OpenGL camera transforms from a given 3x4 P. This comes up in augmented reality, where 3x4 cameras are computed from real imagery using computer vision / structure-from-motion algorithms and then used in CG to render registered synthetic models.

In my case, I am using Blender to generate synthetic data to validate a 3D structure from motion (3D reconstruction) algorithm. I need to use the synthetic Blender cameras inside a computer vision pipeline based on 3x4 P.

Some Details

An image pixel $(u,v)$ is generated from world $(x,y,z)$ coordinates through a 3x4 matrix using projective coordinates:

$$kx = PX,$$

where $k$ is a projective scale factor, $x = (u,v,1)^t$, $X = (X,Y,Z,W)^t$, and $P$ can be decomposed as

$$P=K[I|0] \begin{bmatrix} R & T\\ 0^\text{t} & 1 \end{bmatrix} = K[R|T],$$

where $K$ can be written as

$$K=\begin{bmatrix} \alpha_u & s & u_o\\ 0 & \alpha_v & v_o\\ 0 & 0 & 1 \end{bmatrix},$$

where

$$f=\text{focal length}$$

$$\alpha_u=\frac{f\times u\text{ pixels}}{\text{unit length}}=\frac{f}{\text{width of a pixel in world units}}$$

$$\alpha_v=\frac{f\times v\text{ pixels}}{\text{unit length}}=\frac{f}{\text{height of a pixel in world units}}$$

$$u_o=u\text{ coordinate of principal point}$$

$$v_o=v\text{ coordinate of principal point}$$

$$s=\text{skew factor}$$

I need to generate this from a Blender camera (e.g., from the Cycles engine).
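To make the notation concrete, here is a minimal NumPy sketch (not Blender code; all numbers are made-up example values) that assembles K and P = K[R|T] from the quantities above and projects a world point:

import numpy as np

# Assumed example values (illustrative only)
f = 35.0                  # focal length in mm
pixel_w = pixel_h = 0.01  # width/height of a pixel in mm
u0, v0, s = 960, 540, 0   # principal point and skew

alpha_u = f / pixel_w
alpha_v = f / pixel_h
K = np.array([[alpha_u, s,       u0],
              [0.0,     alpha_v, v0],
              [0.0,     0.0,     1.0]])

# Example extrinsics: identity rotation, camera 5 units along z
R = np.eye(3)
T = np.array([[0.0], [0.0], [5.0]])
P = K @ np.hstack([R, T])          # P = K [R|T]

X = np.array([1.0, 0.0, 0.0, 1.0]) # world point in homogeneous coordinates
x = P @ X
u, v = x[0] / x[2], x[1] / x[2]    # divide by k to get pixel coordinates
print(u, v)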

Preliminary Research

I could certainly work this out myself in terms of all the coordinate systems involved, but I'm looking to reuse well-tested shared code. Perhaps there is something in libmv.

Directly related to: What is blender's camera projection matrix model?, blender camera from 3x4 matrix

Other related questions: How can I get the camera's projection matrix?, How to find image coordinates of the rendered vertex?

rfabbri
  • Is that 3x4 matrix you are talking about any different from a 4x4 Blender matrix with the 4th row removed? Some game engines use this kind of optimization to save some memory (CryEngine afaik). The 3x3 part contains rotation and scale information, the 4th column the location, and the 4th row is rather unneeded, except for 2D/3D projective transformations, because you need to divide by the value of the last cell (it's usually 1 for 3D transformations, and the other 3 entries are 0). – CodeManX Sep 04 '15 at 10:40
  • @CoDEmanX, the 4x4 blender matrix you mention, e.g. camera.matrix_world, does not contain the internal camera parameters (see get_calibration_matrix_K_from_blender below). Moreover, the rotation part of the blender matrix needs to be transposed for it to represent a coordinate change instead of the camera rotation, and rotated appropriately if the desired 3x4 matrix is to represent a camera coordinate system commonly used in computer vision, where x is horizontal, y is down (to align to the actual matrix pixel coordinates) and z is positive in the look-at direction. – rfabbri Sep 12 '15 at 19:24
  • @rfabbri, thanks for your answer. How can we get the projection equation of a fisheye camera? What fisheye model is used? – BetterEnglish Oct 17 '15 at 04:34
  • @startingBlender Your comment merits a question of its own. This is related to lens distortion, which is taken into account separately; see my answer to http://blender.stackexchange.com/questions/38208/distortion-coefficients-and-camera-intrinsic-of-blenders-cameras/38460#38460 – rfabbri Oct 27 '15 at 01:23
  • Now there is a Blender add-on for Computer Vision applications that does this automatically for you: VisionBlender – João Cartucho Nov 01 '20 at 09:04

2 Answers


I wrote the function get_3x4_P_matrix_from_blender and its helpers to do this, listed below.

import bpy
import bpy_extras
from mathutils import Matrix
from mathutils import Vector

#---------------------------------------------------------------
# 3x4 P matrix from Blender camera
#---------------------------------------------------------------

# Build intrinsic camera parameters from Blender camera data
#
# See notes on this in
# blender.stackexchange.com/questions/15102/what-is-blenders-camera-projection-matrix-model
def get_calibration_matrix_K_from_blender(camd):
    f_in_mm = camd.lens
    scene = bpy.context.scene
    resolution_x_in_px = scene.render.resolution_x
    resolution_y_in_px = scene.render.resolution_y
    scale = scene.render.resolution_percentage / 100
    sensor_width_in_mm = camd.sensor_width
    sensor_height_in_mm = camd.sensor_height
    pixel_aspect_ratio = scene.render.pixel_aspect_x / scene.render.pixel_aspect_y
    if camd.sensor_fit == 'VERTICAL':
        # the sensor height is fixed (sensor fit is vertical),
        # the sensor width is effectively changed with the pixel aspect ratio
        s_u = resolution_x_in_px * scale / sensor_width_in_mm / pixel_aspect_ratio
        s_v = resolution_y_in_px * scale / sensor_height_in_mm
    else: # 'HORIZONTAL' and 'AUTO'
        # the sensor width is fixed (sensor fit is horizontal),
        # the sensor height is effectively changed with the pixel aspect ratio
        s_u = resolution_x_in_px * scale / sensor_width_in_mm
        s_v = resolution_y_in_px * scale * pixel_aspect_ratio / sensor_height_in_mm

    # Parameters of intrinsic calibration matrix K
    alpha_u = f_in_mm * s_u
    alpha_v = f_in_mm * s_v
    u_0 = resolution_x_in_px * scale / 2
    v_0 = resolution_y_in_px * scale / 2
    skew = 0 # only use rectangular pixels

    K = Matrix(
        ((alpha_u, skew,    u_0),
        (    0  , alpha_v, v_0),
        (    0  , 0,        1 )))
    return K
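As a quick sanity check on K (a sketch, not part of the original code; it assumes the default horizontal/auto sensor fit with square pixels), alpha_u should agree with the focal length implied by Blender's horizontal field of view camd.angle_x:

import math

def check_K_against_fov(camd):
    # alpha_u = f/sensor_width * image_width should equal
    # (image_width / 2) / tan(fov_x / 2)
    K = get_calibration_matrix_K_from_blender(camd)
    scene = bpy.context.scene
    width_in_px = scene.render.resolution_x * scene.render.resolution_percentage / 100
    alpha_u_from_fov = (width_in_px / 2) / math.tan(camd.angle_x / 2)
    print(K[0][0], alpha_u_from_fov)  # these should match closely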

# Returns camera rotation and translation matrices from Blender.
#
# There are 3 coordinate systems involved:
#    1. The World coordinates: "world"
#       - right-handed
#    2. The Blender camera coordinates: "bcam"
#       - x is horizontal
#       - y is up
#       - right-handed: negative z look-at direction
#    3. The desired computer vision camera coordinates: "cv"
#       - x is horizontal
#       - y is down (to align to the actual pixel coordinates
#         used in digital images)
#       - right-handed: positive z look-at direction
def get_3x4_RT_matrix_from_blender(cam):
    # bcam stands for blender camera
    R_bcam2cv = Matrix(
        ((1, 0,  0),
         (0, -1, 0),
         (0, 0, -1)))

    # Transpose since the rotation is object rotation,
    # and we want coordinate rotation
    # R_world2bcam = cam.rotation_euler.to_matrix().transposed()
    # T_world2bcam = -1*R_world2bcam * location
    #
    # Use matrix_world instead to account for all constraints
    location, rotation = cam.matrix_world.decompose()[0:2]
    R_world2bcam = rotation.to_matrix().transposed()

    # Convert camera location to translation vector used in coordinate changes
    # T_world2bcam = -1*R_world2bcam @ cam.location
    # Use location from matrix_world to account for constraints:
    T_world2bcam = -1*R_world2bcam @ location

    # Build the coordinate transform matrix from world to computer vision camera
    # NOTE: Use * instead of @ here for older versions of Blender
    # TODO: detect Blender version
    R_world2cv = R_bcam2cv @ R_world2bcam
    T_world2cv = R_bcam2cv @ T_world2bcam

    # put into 3x4 matrix
    RT = Matrix((
        R_world2cv[0][:] + (T_world2cv[0],),
        R_world2cv[1][:] + (T_world2cv[1],),
        R_world2cv[2][:] + (T_world2cv[2],)
        ))
    return RT
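An equivalent way to see this (a sketch added for illustration; it assumes the camera object carries no scale, so matrix_world is a pure rotation plus translation) is that [R|T] is just the top three rows of the world-to-camera 4x4, after flipping from Blender's camera axes to the CV convention:

def get_3x4_RT_matrix_via_inverse(cam):
    # world -> bcam as a 4x4 coordinate change
    world2bcam = cam.matrix_world.inverted()
    # flip bcam axes to the CV convention, then drop the last row
    R_bcam2cv = Matrix(((1, 0, 0), (0, -1, 0), (0, 0, -1)))
    RT4 = R_bcam2cv.to_4x4() @ world2bcam
    return Matrix((RT4[0][:], RT4[1][:], RT4[2][:]))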

def get_3x4_P_matrix_from_blender(cam):
    K = get_calibration_matrix_K_from_blender(cam.data)
    RT = get_3x4_RT_matrix_from_blender(cam)
    return K @ RT, K, RT

# ----------------------------------------------------------
# Alternate 3D coordinates to 2D pixel coordinate projection code
# adapted from https://blender.stackexchange.com/questions/882/how-to-find-image-coordinates-of-the-rendered-vertex?lq=1
# to have the y axes pointing up and origin at the top-left corner
def project_by_object_utils(cam, point):
    scene = bpy.context.scene
    co_2d = bpy_extras.object_utils.world_to_camera_view(scene, cam, point)
    render_scale = scene.render.resolution_percentage / 100
    render_size = (
        int(scene.render.resolution_x * render_scale),
        int(scene.render.resolution_y * render_scale),
    )
    return Vector((co_2d.x * render_size[0], render_size[1] - co_2d.y * render_size[1]))

# ----------------------------------------------------------
if __name__ == "__main__":
    # Insert your camera name here
    cam = bpy.data.objects['Camera.001']
    P, K, RT = get_3x4_P_matrix_from_blender(cam)
    print("K")
    print(K)
    print("RT")
    print(RT)
    print("P")
    print(P)

    print("==== Tests ====")
    e1 = Vector((1, 0,    0, 1))
    e2 = Vector((0, 1,    0, 1))
    e3 = Vector((0, 0,    1, 1))
    O  = Vector((0, 0,    0, 1))

    p1 = P @ e1
    p1 /= p1[2]
    print("Projected e1")
    print(p1)
    print("proj by object_utils")
    print(project_by_object_utils(cam, Vector(e1[0:3])))

    p2 = P @ e2
    p2 /= p2[2]
    print("Projected e2")
    print(p2)
    print("proj by object_utils")
    print(project_by_object_utils(cam, Vector(e2[0:3])))

    p3 = P @ e3
    p3 /= p3[2]
    print("Projected e3")
    print(p3)
    print("proj by object_utils")
    print(project_by_object_utils(cam, Vector(e3[0:3])))

    pO = P @ O
    pO /= pO[2]
    print("Projected world origin")
    print(pO)
    print("proj by object_utils")
    print(project_by_object_utils(cam, Vector(O[0:3])))

    # Bonus code: save the 3x4 P matrix into a plain text file
    # Don't forget to import numpy for this
    import numpy
    nP = numpy.matrix(P)
    numpy.savetxt("/tmp/P3x4.txt", nP)  # to select precision, use e.g. fmt='%.2f'
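For SfM validation it can also be useful to go the other way. Here is a short sketch (an addition, assuming NumPy and SciPy are available to Blender's Python; SciPy is not bundled, so it may need to be installed) that recovers K and R from the left 3x3 block of P by RQ decomposition, which should reproduce the matrices printed above:

import numpy
import scipy.linalg

def decompose_P(P):
    P = numpy.array(P)
    M = P[:, :3]                       # M = K R
    K, R = scipy.linalg.rq(M)
    # fix the RQ sign ambiguity so that K has a positive diagonal
    D = numpy.diag(numpy.sign(numpy.diag(K)))
    K, R = K @ D, D @ R
    K /= K[2, 2]                       # normalize so K[2][2] = 1
    T = numpy.linalg.inv(K) @ P[:, 3]  # recover T from the last column
    return K, R, T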

Detailed Tests

  • I included some helper functions and tests to cross-check the 3D-2D projection process against another approach using world_to_camera_view from @ideasman42's answer to How to find image coordinates of the rendered vertex? The floating point coordinates match to many decimal places. To get integer pixel coordinates after projection, just round them (see the sketch after this list).

  • I tested this visually and analytically with a few scenes, to make sure both routines were exactly correct on actual rendered images.

    • I represented e1, e2, e3 and O with small cone objects whose tips were at the desired positions and coded with different colors: Red for X, Green for Y, Blue for Z
    • I placed the camera at a convenient location at first, R = identity, T = (0,0,5)
    • I used default Blender parameters for the camera intrinsics
    • I computed the P matrix by hand and computed it using my routines, and they match perfectly
    • Visually comparing both the Cycles render and the OpenGL render in GIMP shows that the produced coordinates match well (up to perhaps ±1 pixel of error, likely from aliasing; pinning down such small discrepancies is exactly why this careful testing was carried out)
    • I then arbitrarily rotated and translated the camera such that the world e1, e2, e3 remain visible and at general locations. The projections match visually in the rendered images, and also match the ones given by the object utils routine above.
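A minimal sketch of that cross-check in code (an addition; p1, e1 and cam are as in the test block above):

# round to integer pixel coordinates
px = (round(p1[0]), round(p1[1]))

# compare the two projection routines within a small tolerance
diff = Vector(p1[0:2]) - project_by_object_utils(cam, Vector(e1[0:3]))
assert diff.length < 1e-3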

Here is an example of a test output:

  K
<Matrix 3x3 (2100.0000,    0.0000, 960.0000)
            (   0.0000, 2100.0000, 540.0000)
            (   0.0000,    0.0000,   1.0000)>
RT
<Matrix 3x4 (-0.0594, -0.9483, -0.3118, -0.6837)
            ( 0.6234,  0.2087, -0.7536, -0.1887)
            ( 0.7797, -0.2391,  0.5787,  4.2599)>
P
<Matrix 3x4 ( 623.8236, -2220.9482,   -99.1637, 2653.7441)
            (1730.0916,   309.2392, -1269.9426, 1904.1343)
            (   0.7797,    -0.2391,     0.5787,    4.2599)>
==== Tests ====
Projected e1
<Vector (650.3716, 721.1436, 1.0000)>
proj by object_utils
<Vector (650.3716, 721.1436)>
Projected e2
<Vector (107.6402, 550.4857, 1.0000)>
proj by object_utils
<Vector (107.6403, 550.4857)>
Projected e3
<Vector (527.9586, 131.0692, 1.0000)>
proj by object_utils
<Vector (527.9586, 131.0693)>
Projected world origin
<Vector (622.9655, 446.9949, 1.0000)>
proj by object_utils
<Vector (622.9656, 446.9949)>

Here is the test setup, showing the Blender rotation and location set for the camera (the intrinsics are the default camera parameters when adding a camera):

[figure: the Blender extrinsics settings for a test]

Here is the OpenGL render. The coordinates output above match the tips of the cones visually, up to ±1 px error, perhaps due to aliasing:

[figure: the OpenGL render, certifying that the pixel coordinates at the tips of the cones match the ones provided by the constructed 3x4 P matrix]

Limitations: this code currently does not support certain intrinsic camera parameter configurations, see my answer to What is blender's camera projection matrix model? for a list of the limitations to get_calibration_matrix_K_from_blender.

Object coordinates: to deal with object coordinates, first transform to world coordinates; see my answer to How to find image coordinates of the rendered vertex?
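For instance (a sketch; 'Suzanne' is a hypothetical object name), projecting the first vertex of a mesh object means mapping it through the object's matrix_world before applying P:

obj = bpy.data.objects['Suzanne']  # hypothetical object name
v_world = obj.matrix_world @ obj.data.vertices[0].co
p = P @ Vector((v_world.x, v_world.y, v_world.z, 1))
p /= p[2]  # pixel coordinates in p[0], p[1]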

Matrix multiplication used to be * before Blender 2.80; newer versions use @ instead.
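If a single script needs to run on both sides of that change, one option (a sketch addressing the TODO in the code above, not part of the original answer) is to branch on bpy.app.version:

import bpy

def mat_mul(a, b):
    # Blender < 2.80 overloads *, while 2.80+ uses the @ operator
    if bpy.app.version < (2, 80, 0):
        return a * b
    return a @ b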

rfabbri
  • Where can I find details in the Blender documentation about the 3 coordinate systems involved in getting the external parameters? – BetterEnglish Sep 15 '15 at 15:35
  • @startingBlender just figure it out from basic facts: 1) look at the Blender xyz glyphs: they show you Blender uses a right-handed coordinate system 2) in graphics, an image's x coordinate is horizontal and its y is vertical pointing up, with the origin at the bottom-left corner 3) since the camera's coordinate system must be right-handed to be expressible as a rotation and translation from the right-handed world coordinates, the camera look-at direction must be negative z. The third coordinate system "cv" is up to you; just keep this convention in mind when interpreting projected coordinates. – rfabbri Oct 30 '15 at 21:26
  • Thanks! Do you have any idea about the projection model of the panoramic camera with a fisheye equisolid lens? – BetterEnglish Oct 31 '15 at 00:40
  • @startingBlender please ask that in a new question – rfabbri Oct 31 '15 at 04:01
  • I have asked this question http://blender.stackexchange.com/questions/40702/how-can-we-get-the-projection-matrix-of-the-panoramic-camera-with-fisheye-equiso – BetterEnglish Oct 31 '15 at 11:50
  • Thanks a lot for this wonderful answer. I am aware that this post is old, yet I can't help wondering how you figured out the coordinate system of the Blender camera. I was trying to get that information confirmed when I came across this post of yours. – AugLe Jun 08 '17 at 13:47
  • Can you also show how we can get the same information for 2 cameras in a stereoscopic pair? I have asked my question at https://blender.stackexchange.com/questions/81100/stereo-camera-extrinsic-parameter – AugLe Jun 09 '17 at 16:34
  • Helpful post, but in get_calibration_matrix_K_from_blender(), u_0 and v_0 need to take camd.shift_x and camd.shift_y into account. – iNFINITEi Oct 20 '17 at 09:02
  • You also have to import Vector from mathutils – WhatAMesh Aug 29 '18 at 14:15
  • Thanks for the script! Unfortunately there seems to be an issue with it regarding the sensor fit. It produces the wrong values for me no matter whether the fit is AUTO, HORIZONTAL or VERTICAL. If I set alpha_v = alpha_u it works for the AUTO and HORIZONTAL fits. If I set alpha_u = alpha_v it works for the VERTICAL fit. – Daniel Oct 08 '18 at 09:08
  • TypeError: unsupported operand type(s) for @: 'Matrix' and 'Vector' –  Mar 30 '20 at 13:06
  • @HaozheXie someone updated the code for the latest Blender. Use * instead of @ for your Blender version. – rfabbri Mar 30 '20 at 13:48
  • @KarimSherifi This answer was originally written for 2.79 so I think it's important it remains compatible with 2.79. Maybe it would be better to add a note about the change in syntax for 2.8 or to add another code block for the 2.8 version. – Ray Mairlot May 12 '20 at 14:40
  • @rfabbri Probably misunderstanding something here but doesn't Blender have a z-up coordinate system? Your comments say it is y-up, regarding Blender camera coordinates "bcam". Or does it not matter in this case? – chronosynclastic Nov 03 '21 at 16:43
  • I guess I was mixing up the global and view orientations of Blender, see here – chronosynclastic Nov 11 '21 at 13:25
  • @rfabbri The mapped 2d points are divided by the last element, p3 /= p3[2]. What is this last element? Depth perhaps? – June Wang Dec 31 '21 at 05:19

rfabbri's excellent answer works in many cases, but not always. Based on his version I created a new one that (at least in my tests) works in all cases, including portrait and landscape pixel aspect ratios; automatic, horizontal, and vertical sensor fit; and principal point offset (shift).

I had to dig a bit through Blender's source code to figure this out but it seems to work so far:

import bpy
from mathutils import Matrix, Vector

#---------------------------------------------------------------
# 3x4 P matrix from Blender camera
#---------------------------------------------------------------

# BKE_camera_sensor_size
def get_sensor_size(sensor_fit, sensor_x, sensor_y):
    if sensor_fit == 'VERTICAL':
        return sensor_y
    return sensor_x

# BKE_camera_sensor_fit
def get_sensor_fit(sensor_fit, size_x, size_y):
    if sensor_fit == 'AUTO':
        if size_x >= size_y:
            return 'HORIZONTAL'
        else:
            return 'VERTICAL'
    return sensor_fit

# Build intrinsic camera parameters from Blender camera data
#
# See notes on this in
# blender.stackexchange.com/questions/15102/what-is-blenders-camera-projection-matrix-model
# as well as
# https://blender.stackexchange.com/a/120063/3581
def get_calibration_matrix_K_from_blender(camd):
    if camd.type != 'PERSP':
        raise ValueError('Non-perspective cameras not supported')
    scene = bpy.context.scene
    f_in_mm = camd.lens
    scale = scene.render.resolution_percentage / 100
    resolution_x_in_px = scale * scene.render.resolution_x
    resolution_y_in_px = scale * scene.render.resolution_y
    sensor_size_in_mm = get_sensor_size(camd.sensor_fit, camd.sensor_width, camd.sensor_height)
    sensor_fit = get_sensor_fit(
        camd.sensor_fit,
        scene.render.pixel_aspect_x * resolution_x_in_px,
        scene.render.pixel_aspect_y * resolution_y_in_px
    )
    pixel_aspect_ratio = scene.render.pixel_aspect_y / scene.render.pixel_aspect_x
    if sensor_fit == 'HORIZONTAL':
        view_fac_in_px = resolution_x_in_px
    else:
        view_fac_in_px = pixel_aspect_ratio * resolution_y_in_px
    pixel_size_mm_per_px = sensor_size_in_mm / f_in_mm / view_fac_in_px
    s_u = 1 / pixel_size_mm_per_px
    s_v = 1 / pixel_size_mm_per_px / pixel_aspect_ratio

    # Parameters of intrinsic calibration matrix K
    u_0 = resolution_x_in_px / 2 - camd.shift_x * view_fac_in_px
    v_0 = resolution_y_in_px / 2 + camd.shift_y * view_fac_in_px / pixel_aspect_ratio
    skew = 0 # only use rectangular pixels

    K = Matrix(
        ((s_u, skew, u_0),
        (   0,  s_v, v_0),
        (   0,    0,   1)))
    return K
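To see the shift handling in action, here is a tiny sketch (an addition; the camera name is just an example) showing how a horizontal lens shift moves the principal point away from the image center:

cam_data = bpy.data.objects['Camera'].data  # example camera name
cam_data.shift_x = 0.1
K = get_calibration_matrix_K_from_blender(cam_data)
# u_0 is now resolution_x_in_px / 2 - 0.1 * view_fac_in_px
print(K[0][2])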

# Returns camera rotation and translation matrices from Blender.
#
# There are 3 coordinate systems involved:
#    1. The World coordinates: "world"
#       - right-handed
#    2. The Blender camera coordinates: "bcam"
#       - x is horizontal
#       - y is up
#       - right-handed: negative z look-at direction
#    3. The desired computer vision camera coordinates: "cv"
#       - x is horizontal
#       - y is down (to align to the actual pixel coordinates
#         used in digital images)
#       - right-handed: positive z look-at direction
def get_3x4_RT_matrix_from_blender(cam):
    # bcam stands for blender camera
    R_bcam2cv = Matrix(
        ((1, 0,  0),
         (0, -1, 0),
         (0, 0, -1)))

    # Transpose since the rotation is object rotation,
    # and we want coordinate rotation
    # R_world2bcam = cam.rotation_euler.to_matrix().transposed()
    # T_world2bcam = -1*R_world2bcam @ location
    #
    # Use matrix_world instead to account for all constraints
    location, rotation = cam.matrix_world.decompose()[0:2]
    R_world2bcam = rotation.to_matrix().transposed()

    # Convert camera location to translation vector used in coordinate changes
    # T_world2bcam = -1*R_world2bcam @ cam.location
    # Use location from matrix_world to account for constraints:
    T_world2bcam = -1*R_world2bcam @ location

    # Build the coordinate transform matrix from world to computer vision camera
    R_world2cv = R_bcam2cv @ R_world2bcam
    T_world2cv = R_bcam2cv @ T_world2bcam

    # put into 3x4 matrix
    RT = Matrix((
        R_world2cv[0][:] + (T_world2cv[0],),
        R_world2cv[1][:] + (T_world2cv[1],),
        R_world2cv[2][:] + (T_world2cv[2],)
        ))
    return RT

def get_3x4_P_matrix_from_blender(cam):
    K = get_calibration_matrix_K_from_blender(cam.data)
    RT = get_3x4_RT_matrix_from_blender(cam)
    return K @ RT, K, RT

# ----------------------------------------------------------
if __name__ == "__main__":
    # Insert your camera name here
    cam = bpy.data.objects['Camera']
    P, K, RT = get_3x4_P_matrix_from_blender(cam)
    print("K")
    print(K)
    print("RT")
    print(RT)
    print("P")
    print(P)

    print("==== 3D Cursor projection ====")
    pc = P @ bpy.context.scene.cursor.location
    pc /= pc[2]
    print("Projected cursor location")
    print(pc)

    # Bonus code: save the 3x4 P matrix into a plain text file
    # Don't forget to import numpy for this
    #nP = numpy.matrix(P)
    #numpy.savetxt("/tmp/P3x4.txt", nP)  # to select precision, use e.g. fmt='%.2f'

Note that the sensor fit returned by get_sensor_fit is intentionally not used as the first parameter in the call to get_sensor_size. This mirrors what Blender's source code does, and it is where the difference between the VERTICAL and AUTO fits comes from.
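For example (a quick sketch using the helper functions above; the numbers are made up): a 1920x1080 render with square pixels auto-fits horizontally, while a rotated 1080x1920 render auto-fits vertically, yet the sensor dimension still follows the raw camd.sensor_fit setting:

print(get_sensor_fit('AUTO', 1.0 * 1920, 1.0 * 1080))  # -> 'HORIZONTAL'
print(get_sensor_fit('AUTO', 1.0 * 1080, 1.0 * 1920))  # -> 'VERTICAL'
# get_sensor_size is called with the raw 'AUTO' setting, so it
# returns the sensor width even when the fit resolves to VERTICAL:
print(get_sensor_size('AUTO', 36.0, 24.0))             # -> 36.0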

Daniel
  • Do you know how to set a Blender camera to match the camera intrinsics of a real camera? – Meta Fan Apr 02 '19 at 14:51
  • A life saver! The sensor size problem in rfabbri's code was giving me trouble, and your modifications fixed it. Thanks! – RaveTheTadpole Aug 18 '19 at 07:29
  • I think there is a mistake in your script, which I found while rendering pictures for testing stereoscopy algorithms. If you set the camera to a preset, e.g. GoPro Hero3 Black, the box for sensor height does not show up and it appears that it is not used. alpha_v equals alpha_u in this case. At least I get the right results this way. – qwert wayne May 03 '17 at 09:00
  • @Daniel Is there any mistake in the script? I am trying to verify the results obtained from your approach against the results from Blender's world_to_camera_view(), and they seem to be different. – Tejus Jun 14 '20 at 09:58
  • @Tejus I just checked with the latest Blender version and it still seems to work. Some points to note though: the script uses "Computer Vision" conventions, so the coordinates you get will be between (0, 0) and (resolution_x, resolution_y) for points that project onto the image plane. Note that the x axis goes right and the y axis goes down. Blender's world_to_camera_view() will give values between 0 and 1, so you need to multiply those by C.scene.render.resolution_x or C.scene.render.resolution_y and C.scene.render.resolution_percentage. Also, Blender's y goes up. – Daniel Jun 14 '20 at 11:06
  • @Daniel I'm sorry, but your implementation is spot on. You're right about world_to_camera_view() giving values between 0 and 1. But its implementation in Blender versions <2.83 is wrong: they don't account for any horizontal/vertical shift in the camera. This issue has been addressed and fixed in the latest version (https://developer.blender.org/T74577). – Tejus Jun 14 '20 at 16:02