Philipp Gross

Turn your selfie into a LEGO® brick model

Use volumetric regression networks to convert a photo of your face into a 3D voxel model, and then apply stochastic optimization to create LEGO® build layouts.

A few weeks ago, we had the idea to build an app that lets users scan an object with their smartphone and convert the photos into a 3D model that can be built with LEGO® bricks. In the following, we describe the computer vision and machine learning technologies involved in this experiment.

3D reconstruction with volumetric regression networks

Mapping a series of 2D views of an object onto a 3D model is a classical problem in computer vision, known as Multi-View Stereo reconstruction (MVS). Every solution makes certain assumptions, the most prominent being scene rigidity: no moving or deforming objects are present within the scene of interest. Other required inputs, which are hard to come by, include the material, the intrinsic camera geometry, the camera location and orientation, and the lighting conditions. If these are not known, the problem is ill-posed, since multiple combinations can produce exactly the same photographs. In general, the reconstruction requires complex pipelines and solving non-convex optimization problems.

With the recent advent of deep learning techniques in 3D reconstruction, a promising approach to problems like this is to train deep neural networks (DNNs). Given a large amount of training data, such networks have been quite successful in a variety of computer vision applications, including image classification and face detection.

Since 3D reconstruction is in general a difficult problem, we decided to restrict ourselves to an object category that has been studied extensively before, and that is fun to play with: faces. In 2017, Aaron Jackson et al. published an impressive article 1 in which they introduced Volumetric Regression Networks (VRNs) and applied them to face reconstruction. They showed that a CNN can learn directly, in an end-to-end fashion, the mapping from image pixels to the full 3D facial geometry (including the non-visible facial parts) from just a single 2D facial image.

The proposed VRN is a CNN architecture based on two stacked hourglass networks, which use skip connections and residual learning. It accepts an RGB input of shape (3, 192, 192) and directly regresses a 3D volume of shape (200, 192, 192). Each rectangle is a residual module of 256 features. (© Aaron Jackson et al.)

Generously, Jackson et al. also published their code and a demo based on Torch7. Additionally, Paul Lorenz was so kind as to contribute the transfer of the pre-trained VRN model to Keras/TensorFlow with his vrn-torch-to-keras project. This makes loading the model quite simple:

import tensorflow as tf
from tensorflow.core.framework import graph_pb2

def load_model(path, sess):
    # Read the frozen graph and import it into the session's default graph.
    with open(path, "rb") as f:
        output_graph_def = graph_pb2.GraphDef()
        output_graph_def.ParseFromString(f.read())
        _ = tf.import_graph_def(output_graph_def, name="")
    # Input placeholder and the final sigmoid activation of the VRN.
    x = sess.graph.get_tensor_by_name('input_1:0')
    y = sess.graph.get_tensor_by_name('activation_274/Sigmoid:0')
    return x, y

sess = tf.Session()
model = load_model('vrn-tensorflow.pb', sess)

We load an input image with Pillow and NumPy:

from PIL import Image as pil_image
import numpy as np

def load_image(f):
    img = pil_image.open(f)
    img = img.resize((192, 192), pil_image.NEAREST)
    img = np.asarray(img, dtype=np.float32)
    # The shape is (192, 192, 3), i.e. channels-last order.
    return img

You should only use square images; otherwise the scaling will distort the proportions.
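If you only have a non-square photo, a simple center crop avoids the distortion. This helper is a minimal sketch, not part of the original pipeline:

def center_crop_square(img):
    # Crop the largest centered square from a PIL image.
    w, h = img.size
    s = min(w, h)
    left, top = (w - s) // 2, (h - s) // 2
    return img.crop((left, top, left + s, top + s))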

Now, we have everything we need to run the reconstruction:

def reconstruct3d(model, img, sess):
    x, y = model
    # The network expects channels-first input of shape (3, 192, 192).
    img = np.transpose(img, (2, 0, 1))

    vol = sess.run(y, feed_dict={x: np.array([img])})[0]
    # vol.shape = (200, 192, 192)
    return vol
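Putting the pieces together (the input file name is hypothetical):

img = load_image('selfie.jpg')  # hypothetical input file
vol = reconstruct3d(model, img, sess)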

The output is just a NumPy array of dimension 3, where positive values indicate occupied voxel positions (voxels are the generalization of pixels to three-dimensional space). You can use raw2obj.py to convert it into a colored mesh and write it as an OBJ file for further processing. This simple text file format is understood by various 3D editing tools and libraries. We use three.js to render it with WebGL in the browser:

The input image (left) and the rendered output mesh (middle and right).
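Under the hood, the conversion from volume to mesh essentially boils down to thresholding the volume and running marching cubes over it. Here is a minimal sketch using scikit-image (the function is called marching_cubes_lewiner in older versions); the actual raw2obj.py script additionally handles the vertex colors:

from skimage import measure

def volume_to_obj(vol, path, level=0.5):
    # Extract the isosurface of the (200, 192, 192) volume.
    verts, faces, normals, values = measure.marching_cubes(vol, level)
    with open(path, 'w') as f:
        for v in verts:
            f.write('v {} {} {}\n'.format(*v))
        for face in faces:
            # OBJ vertex indices are 1-based.
            f.write('f {} {} {}\n'.format(*(face + 1)))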

Obviously, the VRN can’t handle glasses, but the results are nevertheless impressive.

Brick model construction

Having a solution to the 3D reconstruction problem at hand, it remains to find a LEGO® build layout that approximates the 3D body with a limited set of pieces. This is also known as legoization or brickification.

The first step is to go back to a voxel representation. If the voxels are mapped one-to-one onto 1x1 LEGO® bricks, the model will in general not hold together. So voxels of similar color have to be merged into bigger bricks until a stable structure consisting of one connected component is found. In general, this is a hard combinatorial optimization problem. It was twice posed as an open challenge by engineers from the LEGO® company, in 1998 and 2001 2, and different solutions were proposed using simulated annealing 3, evolutionary algorithms 2, or graph theory 4.

In our case, we are lucky that the face mesh is just a deformed ball, so the problem shouldn’t be that difficult to solve. First, we rasterize the face mesh at some fixed resolution in order to get voxels:

Voxels for three different resolutions, with counts 563, 3830 and 16552 (from left to right).
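As a sketch of this rasterization step, one can use trimesh’s built-in voxelizer (an assumption on our side; any mesh voxelization method works):

import trimesh

def voxelize(path, pitch):
    # Load the face mesh and rasterize it into cubes with edge length `pitch`.
    mesh = trimesh.load(path)  # assumes the OBJ contains a single mesh
    grid = mesh.voxelized(pitch)
    # grid.matrix is a boolean occupancy array; a larger pitch means
    # a coarser resolution and fewer voxels.
    return grid.matrix

occupancy = voxelize('face.obj', pitch=2.0)
print(occupancy.sum(), 'voxels')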

Even though the basic bricks are available in many colors at the LEGO® Pick a Brick store, the color space is much smaller than the full RGB space.

Selection of LEGO® colors (29): Black, Brick Yellow, Bright Blue, Bright Green, Bright Orange, Bright Purple, Bright Red, Bright Reddish Violet, Bright Yellow, Bright Yellowish Green, Cool Yellow, Dark Brown, Dark Green, Dark Orange, Dark Stone Grey, Earth Blue, Earth Green, Flame Yellowish Orange, Light Purple, Medium Azur, Medium Blue, Medium Lilac, Medium Stone Grey, Olive Green, Reddish Brown, Sand Green, Sand Yellow, Spring Yellowish Green, White.

Since we are interested in building a real-life object instead of just a virtual model, we need to convert the colors with minimal perceptual loss. For that, we map the original colors into the Lab color space and choose the nearest LEGO® color using the Delta E 2000 color difference.
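A minimal sketch of this lookup, assuming the palette is given as an array of sRGB triples in [0, 1] and using scikit-image for both the Lab conversion and the CIEDE2000 difference:

import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

def nearest_lego_color(rgb, palette):
    # rgb: (3,) float array; palette: (n, 3) float array, both in [0, 1].
    lab = rgb2lab(rgb.reshape(1, 1, 3))
    palette_lab = rgb2lab(palette.reshape(-1, 1, 3))
    # Delta E 2000 distance between the pixel and every palette entry.
    dists = deltaE_ciede2000(lab, palette_lab).ravel()
    return palette[np.argmin(dists)]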

Color mapping to 29 LEGO® colors, using the L2 metric in RGB space, or the Delta E 76, Delta E 94 and Delta E 2000 color differences in Lab space (from left to right).

The resulting conversion is not optimal yet, but good enough to keep going.

As we increase the resolution, the number of voxels grows cubically, which complicates the combinatorial problem and slows down the rendering. Therefore, we carve out the inner invisible voxels and keep just a thin shell. Moreover, it suffices to drop the back of the face mesh, because the front part already contains the facial geometry.

Carved voxels with shell size 3. Only the visible voxels are colored.
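The carving itself is a morphological erosion; a minimal sketch, assuming the voxels are stored as a boolean occupancy grid:

from scipy.ndimage import binary_erosion

def carve_shell(occupancy, shell_size=3):
    # Erode the solid `shell_size` times and keep only the voxels that were
    # removed, leaving a shell of the given thickness.
    interior = binary_erosion(occupancy, iterations=shell_size)
    return occupancy & ~interior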

The upshot of the reduced color palette is that we can merge the 1x1 bricks into larger bricks of the same color, which increases the stability and stiffness of the model. For simplicity, we work only with the basic brick types (1x1, 1x2, 1x3, 1x4, 1x6, 1x8, 2x2, 2x3, 2x4, 2x6, 2x8). As a first naive optimization algorithm, for each layer we repeatedly merge random adjacent bricks if the merged brick is admissible and all underlying visible voxels have the same color.
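A sketch of that merge step, simplified to a single layer of bricks that all share one color (the real algorithm additionally checks the colors of the underlying visible voxels):

import random

# Footprints (width, length) of the basic bricks, in both orientations.
SIZES = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 6), (1, 8),
         (2, 2), (2, 3), (2, 4), (2, 6), (2, 8)}
SIZES |= {(l, w) for (w, l) in SIZES}

def try_merge(a, b):
    # a, b: bricks as (x, y, w, l) rectangles. Returns the merged brick
    # if their union is an admissible rectangle, otherwise None.
    (ax, ay, aw, al), (bx, by, bw, bl) = a, b
    if ay == by and al == bl and ax + aw == bx:    # b directly right of a
        m = (ax, ay, aw + bw, al)
    elif ax == bx and aw == bw and ay + al == by:  # b directly behind a
        m = (ax, ay, aw, al + bl)
    else:
        return None
    return m if (m[2], m[3]) in SIZES else None

def merge_layer(bricks, steps=2000):
    # Start from 1x1 bricks and repeatedly merge random adjacent pairs.
    bricks = list(bricks)
    for _ in range(steps):
        if len(bricks) < 2:
            break
        a, b = random.sample(bricks, 2)
        m = try_merge(a, b) or try_merge(b, a)
        if m:
            bricks.remove(a)
            bricks.remove(b)
            bricks.append(m)
    return bricks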

Since this algorithm processes each layer independently, it doesn’t take the overall structure into account, so some bricks might end up disconnected. To minimize this effect, we want each brick to connect with as many bricks below it as possible, while at the same time minimizing the total number of bricks. This gives rise to a cost function that can evaluate any brick layout.
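One possible form of such a cost function; the weights here are hypothetical knobs, not the values we actually used:

def layout_cost(bricks, w_count=1.0, w_conn=2.0):
    # bricks: list of (layer, x, y, w, l). Lower cost is better: few bricks
    # overall, and each brick resting on as many bricks below as possible.
    by_layer = {}
    for b in bricks:
        by_layer.setdefault(b[0], []).append(b)
    connections = 0
    for z, layer in by_layer.items():
        for (_, x, y, w, l) in layer:
            for (_, x2, y2, w2, l2) in by_layer.get(z - 1, []):
                # Count bricks in the layer below whose footprint overlaps.
                if x < x2 + w2 and x2 < x + w and y < y2 + l2 and y2 < y + l:
                    connections += 1
    return w_count * len(bricks) - w_conn * connections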

Now we repeat our initial algorithm and replace the current solution whenever the cost goes down. This meta-algorithm is also known as random-restart hill climbing. As a final postprocessing step, we compute the connected components of the whole brick layout and remove those that are disconnected from the ground. In most cases this yields an approximate brick layout that is good enough.
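Putting the two sketches above together, the random-restart hill climbing loop might look like this:

def optimize(voxel_layers, iterations=20):
    # voxel_layers: per-layer lists of 1x1 bricks. Rebuild the layout from
    # scratch each round and keep the cheapest solution seen so far.
    best, best_cost = None, float('inf')
    for _ in range(iterations):
        candidate = [(z,) + brick
                     for z, layer in enumerate(voxel_layers)
                     for brick in merge_layer(layer)]
        cost = layout_cost(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best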

Result after 20 iterations. It has three connected components: a tiny part on the front, marked green (left), a tiny invisible part (middle) and the main component (right).

Primary connected component, rendered with knobs.

Conclusion

Given the fantastic VRN models, it is quite easy to create a LEGO® layout from a single selfie. While the color conversion is far from perfect, it works very well for grayscale pictures or faces whose colors are already close to the LEGO® palette.

Next, we are going to build a real-life example and see how well our layout algorithm works in practice!

References

  1. Jackson, Aaron S.; Bulat, Adrian; Argyriou, Vasileios; Tzimiropoulos, Georgios. Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression. International Conference on Computer Vision, 2017.

  2. Petrovic, Pavel. Solving the LEGO brick layout problem using evolutionary algorithms. Tech. rep., Norwegian University of Science and Technology, 2001.

  3. Gower, Rebecca A. H.; Heydtmann, Agnes E.; Petersen, Henrik G. LEGO: Automated Model Construction. 1998.

  4. Testuz, Roman; Schwartzburg, Yuliy; Pauly, Mark. Automatic Generation of Constructable Brick Sculptures. 2013.

Gunnar Aastrand Grimnes

Down the debugging rabbit-hole

The story of a fun four-hour debugging adventure with pytest and Tornado

We use Tornado as an async webserver for our Python projects, and often pytest for testing.

The two of them come together nicely in pytest-tornado, which gives you pytest marks for async/coroutine tests and fixtures for setting up and tearing down your application.

So, we set off to write some tests for a new project. First, we added a login test:

test_login.py

@pytest.mark.gen_test
def test_login(http, login_credentials):

    r = yield http.post('/api/login', body=login_credentials)
    return = {"Cookie": str(r.headers["Set-Cookie"])}

It passes! Great!

Next, we added a test that uploads a schema; the details don’t matter, it just posts some JSON. Since it has to log in first, we reuse the login test, which already returns the cookie we need:

test_schema.py

from test.test_login import test_login

@pytest.mark.gen_test
def test_schema(http, login_credentials):

    headers = yield test_login(http, login_credentials)

    yield http.post('/api/schema', body=[...], headers=headers)

    [... actually test something ...]

It also works!

Next test: test_answers – again, the details don’t matter: it logs in, makes some HTTP requests and tests some things.

test_answers.py

from test.test_login import test_login

@pytest.mark.gen_test
def test_answers(http, login_credentials):

    headers = yield test_login(http, login_credentials)

    yield http.post('/api/answers', body=[...], headers=headers)

    [... actually test something ...]

Aaaand…. it fails with:

>       yield test_login()

test_answers.py:11:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
env/lib/python3.6/site-packages/tornado/gen.py:1055: in run
    value = future.result()
env/lib/python3.6/site-packages/tornado/concurrent.py:238: in result
    raise_exc_info(self._exc_info)
<string>:4: in raise_exc_info
    ???
env/lib/python3.6/site-packages/tornado/gen.py:1143: in handle_yield
    self.future = convert_yielded(yielded)
env/lib/python3.6/functools.py:803: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

yielded = <generator object test_login at 0x102420728>

    def convert_yielded(yielded):
        """Convert a yielded object into a `.Future`.

        The default implementation accepts lists, dictionaries, and Futures.

        If the `~functools.singledispatch` library is available, this function
        may be extended to support additional types. For example::

            @convert_yielded.register(asyncio.Future)
            def _(asyncio_future):
                return tornado.platform.asyncio.to_tornado_future(asyncio_future)

        .. versionadded:: 4.1
        """
        # Lists and dicts containing YieldPoints were handled earlier.
        if yielded is None:
            return moment
        elif isinstance(yielded, (list, dict)):
            return multi(yielded)
        elif is_future(yielded):
            return yielded
        elif isawaitable(yielded):
            return _wrap_awaitable(yielded)
        else:
>           raise BadYieldError("yielded unknown object %r" % (yielded,))
E           tornado.gen.BadYieldError: yielded unknown object <generator object test_login at 0x102420728>

env/lib/python3.6/site-packages/tornado/gen.py:1283: BadYieldError

WTF?

So there must be a stupid error somewhere. We check for typos; we go back and start copy-pasting code from the working test_schema to make sure we didn’t type @pytest.mark.test_gen or something. The failure remains.

After a while we reach the state where test_schema.py and test_answers.py are byte-for-byte identical, but answers fails and schema passes. We go home and rethink our lives.

The next day, we realise that when invoked on just one of those files, pytest will run TWO tests: it finds test_login through the import as well as the test in the file we invoked it on. And the order differs: pytest orders the collected tests alphabetically, so in the case of test_answers it first runs that test and then test_login, whereas for test_schema the login test runs first.

WTF?

Renaming test_answers to test_manswers (which sorts after test_login) confirms it: the tests then pass.

But why does the order matter? Digging a bit deeper, we see that the value returned from test_login is in both cases of type generator, yet Tornado is happy with one of them but not the other. In the convert_yielded function (which, among other things, lets Tornado work with await/async generators), Tornado uses inspect.isawaitable to check whether the passed generator can actually be treated as a future. This is False when the test fails.

This is the code for isawaitable:

def isawaitable(object):
    """Return true if object can be passed to an ``await`` expression."""
    return (isinstance(object, types.CoroutineType) or
            isinstance(object, types.GeneratorType) and
                bool(object.gi_code.co_flags & CO_ITERABLE_COROUTINE) or
            isinstance(object, collections.abc.Awaitable))

It’s the co_flags line that causes our problem: in the working case, the flag marking an iterable coroutine is set. co_flags sits pretty deep in the Python internals and contains a number of flags for the interpreter (the inspect docs have the full list). Our CO_ITERABLE_COROUTINE flag was added in PEP 492, which says that:

The [types.coroutine()] function applies CO_ITERABLE_COROUTINE flag to generator-function’s code object, making it return a coroutine object.

And here the rabbit hole ends! We can inspect types.coroutine:


        # Check if 'func' is a generator function.
        # (0x20 == CO_GENERATOR)
        if co_flags & 0x20:
            # (our debugging breakpoint, patched into the stdlib source)
            if func.__name__ == 'test_b':
                import ipdb; ipdb.set_trace()
            # TODO: Implement this in C.
            co = func.__code__
            func.__code__ = CodeType(
                co.co_argcount, co.co_kwonlyargcount, co.co_nlocals,
                co.co_stacksize,
                co.co_flags | 0x100,  # 0x100 == CO_ITERABLE_COROUTINE
                co.co_code,
                co.co_consts, co.co_names, co.co_varnames, co.co_filename,
                co.co_name, co.co_firstlineno, co.co_lnotab, co.co_freevars,
                co.co_cellvars)
            return func


And there the function’s __code__ object is modified in place, setting the flag! Setting a breakpoint there lets us see that pytest-tornado calls tornado.gen.coroutine on our function, which in turn calls types.coroutine:

    # On Python 3.5, set the coroutine flag on our generator, to allow it
    # to be used with 'await'.
    wrapped = func
    if hasattr(types, 'coroutine'):
        func = types.coroutine(func)

And this is how test_login only works after it has already been run as a test: running it through pytest-tornado sets the flag on its code object in place, so later direct calls return an awaitable generator.
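The whole effect fits into a few lines, independent of pytest (a minimal reproduction on the Python 3 versions discussed here):

import inspect
import types

def gen():
    yield 42

print(inspect.isawaitable(gen()))   # False
types.coroutine(gen)                # mutates gen.__code__ in place
print(bool(gen.__code__.co_flags & inspect.CO_ITERABLE_COROUTINE))  # True
print(inspect.isawaitable(gen()))   # True: same function, new behaviour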

¯\_(ツ)_/¯

In the end, that’s the explanation, but there is no real solution: we cannot rely on the tests running in alphabetical order, so we move the reusable code out into its own function:

from tornado.gen import coroutine

@pytest.mark.gen_test
def test_login(http, login_credentials):
    return ( yield do_login(http, login_credentials) )

@coroutine
def do_login(http, login_credentials):
    r = yield http.post('/api/login', body=login_credentials)
    return {"Cookie": str(r.headers["Set-Cookie"])}

Then we only import the do_login function elsewhere. You were probably not meant to reuse actual test functions like this anyway.
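For completeness, a sketch of what the importing side looks like afterwards (module paths are whatever your project uses):

import pytest

from test.test_login import do_login

@pytest.mark.gen_test
def test_schema(http, login_credentials):
    headers = yield do_login(http, login_credentials)
    yield http.post('/api/schema', body=[...], headers=headers)
    # ... actually test something ...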

In fact, in Python 2 there is no isawaitable and both tests fail, so only in Python 3 do you get this weird works-only-in-the-right-alphabetical-order bug.

That’s it, four hours of weirdness later! Note that normally pytest and Tornado are actually pretty good friends. Next time, we’ll write a blog post about how well they work together!