I would approach it by generating an image like so:
import bpy

temp_name = "temp.material.{}"
step = 10
scene_name = "Scene"

def convert(c):
    # sRGB byte value -> linear float, so that with the 'Standard' view
    # transform the rendered pixel ends up close to the chosen byte value.
    # https://blender.stackexchange.com/a/158902/60486
    c /= 255
    if c < 0.04045:
        return c / 12.92
    else:
        return ((c + 0.055) / 1.055) ** 2.4

# Unique RGBA colors on a grid of multiples of `step` (here 0, 10, ..., 250).
color_generator = (
    (r, g, b, 1)
    for r in range(0, 256, step)
    for g in range(0, 256, step)
    for b in range(0, 256, step)
)

def new_material(num, color):
    # Unshaded material: an RGB node wired straight into the output socket.
    mat = bpy.data.materials.new(name=temp_name.format(num))
    mat.use_nodes = True
    nt = mat.node_tree
    rgb_node = nt.nodes.new('ShaderNodeRGB')
    nt.nodes.remove(nt.nodes['Principled BSDF'])
    nt.links.new(nt.nodes['Material Output'].inputs['Surface'], rgb_node.outputs['Color'])
    # Convert only the RGB components; keep alpha at 1 (running alpha
    # through convert() would darken it to nearly 0).
    rgb_node.outputs['Color'].default_value = tuple(map(convert, color[:3])) + (1,)
    return mat

# Swap every mesh's materials for one flat color per object, remembering the
# originals. Note: a mesh with no material slots at all is left untouched.
prev_mats_dic = {}
i = 0
for o in bpy.data.objects:
    if o.type == 'MESH':
        prev_mats = []
        color = next(color_generator)
        print(o.name, color)
        new_mat = new_material(i, color)
        for mat_slot in o.material_slots:
            prev_mats.append(mat_slot.material)
            mat_slot.material = new_mat
        prev_mats_dic[o] = prev_mats
        i += 1

# Render with the 'Standard' view transform so Filmic doesn't shift the colors.
view_settings = bpy.data.scenes[scene_name].view_settings
old_view_transform = view_settings.view_transform
view_settings.view_transform = 'Standard'
bpy.ops.render.render(write_still=True)
view_settings.view_transform = old_view_transform

# Restore the original materials and delete the temporary ones.
for o, mats in prev_mats_dic.items():
    for i, mat_slot in enumerate(o.material_slots):
        temp = mat_slot.material
        mat_slot.material = mats[i]
        # With several slots the shared temp material is already gone
        # (and the slot reads None) after the first removal.
        if temp is not None:
            bpy.data.materials.remove(temp)
This script assigns temporary flat colors to objects and prints them to the system console together with the object names; it also renders an image in which the objects appear in those colors, without any shading. Now you can read this image back: divide each RGB component by the step (here 10) and round to the nearest integer, because on the rendered image there is a slight +/- 2 variance. Then multiply the components by the step again to get the flattened colors if you want to save the image, or just divide the colors from the console by the step, so the two match. Either way you end up with, for each pixel, the information of which object is displayed there. Some pixels will fall exactly on an edge between two objects, though, so it's probably best to render at high resolution with 1 sample (for no anti-aliasing).
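If you want to force those render settings from the script as well, here is a minimal sketch to run before bpy.ops.render.render() (it assumes Blender 2.8x+ attribute names; the resolution and output path are placeholders to adjust):

import bpy

scene = bpy.data.scenes["Scene"]
scene.render.resolution_x = 3840        # high resolution: proportionally fewer edge pixels
scene.render.resolution_y = 2160
scene.render.filter_size = 0.01         # minimum pixel filter size, keeps colors flat
scene.render.filepath = "//render.png"  # "//" = next to the .blend file
if scene.render.engine == 'CYCLES':
    scene.cycles.samples = 1            # 1 sample = no anti-aliasing
else:
    scene.eevee.taa_render_samples = 1  # Eevee equivalent

And a sketch of the decoding step, run outside Blender (it assumes Pillow and NumPy are installed; "render.png" and printed_pairs are placeholders, the pairs being the name/color lines the script printed to the console):

import numpy as np
from PIL import Image

step = 10  # must match the step used in the Blender script

img = np.asarray(Image.open("render.png").convert("RGB"), dtype=np.int64)

# Snap each channel to the nearest multiple of `step`, which absorbs the
# slight +/- variance from rendering and 8-bit quantization.
flattened = (np.rint(img / step) * step).astype(np.int64)

# Lookup table from flattened color back to object name, built from the
# (name, color) lines printed to the console; these two entries are made up.
printed_pairs = [("Cube", (0, 0, 10, 1)), ("Sphere", (0, 10, 0, 1))]
color_to_object = {rgba[:3]: name for name, rgba in printed_pairs}

# Which object is under pixel (y, x)?
y, x = 100, 200
key = tuple(int(v) for v in flattened[y, x])
print(color_to_object.get(key, "background or edge pixel"))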