U-mesh: Human Correspondence Matching with Mesh Convolutional Networks

The proliferation of 3D scanning technology has driven a need for methods to interpret geometric data, particularly for human subjects. We propose an elegant fusion of regression (bottom-up) and generative (top-down) methods to fit a parametric template model to raw scan meshes. Our first major contribution is an intrinsic convolutional mesh U-net architecture that predicts pointwise correspondence to a template surface. Soft-correspondence is formulated as coordinates in a newly-constructed Cartesian space. Modeling correspondence as Euclidean proximity enables efficient optimization, both for network training and for the next step of the algorithm. Our second contribution is a generative optimization algorithm that uses the U-net correspondence predictions to guide a parametric Iterative Closest Point registration. By employing pre-trained human surface parametric models we maximally leverage domain-specific prior knowledge. The pairing of a mesh-convolutional network with generative model fitting enables us to predict correspondence for real human surface scans including occlusions, partialities, and varying genus. We evaluate the proposed method on the FAUST correspondence challenge.

With the increased availability of high-quality 3D scanning systems (depth cameras, LIDAR, stereo-photogrammetry, etc.) it is necessary to develop suitable techniques to make sense of the massive influx of geometric data. In many applications, human subjects are the object of interest in these scans, generating a need for fast and robust methods to interpret nonrigid surfaces. Indeed, correspondence matching for human body surfaces has attracted considerable academic attention and proven to be a challenging problem in geometric computer vision. Interest has only grown with the rise of teleconferencing, remote medicine, and augmented/virtual reality. These practical applications reinforce the need for methods that can cope with real scan data exhibiting measurement noise, occlusions/partialities, and areas of self-contact. We tackle each part of this challenging task, leveraging prior information where available and developing novel methods to overcome gaps in current techniques.

Figure 1: Raw scans (left) are registered to a generative model (right) by way of soft-correspondence predictions (middle).

1.1 Regression vs Optimization

As of this writing, the most successful methods for human correspondence matching, according to the FAUST correspondence challenge, are built on PointNet-style encoders paired with template-deformation decoder networks.¹ These methods preserve local structure by modeling the decoder as deformation of a template surface (with appropriate regularization). This obviates the need for post-hoc filtering but effectively replaces one optimization step with another, as the initial output of the surface autoencoder is generally far from optimal. In our method we take inspiration from this fusion of top-down and bottom-up techniques, but argue that better results can be had by swapping out the general deformation model for pre-trained articulated human surface models. Furthermore, rather than discarding the local structure of the input surface (à la PointNet) we employ an intrinsic mesh-based regression model.

¹ Note that many of the methods ranked on the public FAUST benchmark do not represent correspondence matching attempts, and may instead be autoencoders that rely on manual landmarking.

To learn a surface-to-template correspondence model, we propose a U-net style architecture operating on X directly. Cognisant of the challenges inherent to spectral representations, we operate on the explicit surface shape. Multi-scale models propagate information between coarse and fine layers, enabling the model to take advantage of relationships between features at different resolutions. Feature downsampling is performed by max-pooling, while upsampling uses interpolation.

Figure: The raw input scan is pre-processed to a standard mesh resolution and all pooling/convolution layers are pre-computed. Each convolution is implemented as a resblock unless otherwise indicated; all layers have 64 output features. The scan surface and predicted correspondence are used to optimize model parameters to generate a registered template surface (bottom right).
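
The multi-scale U-net described in this paper downsamples vertex features by max-pooling and upsamples them by interpolation, with the pooling layers pre-computed for a standard mesh resolution. The following is a minimal sketch of those two operations in plain Python. It is not the authors' implementation: the names `max_pool`, `interpolate`, `clusters`, and `weights` are hypothetical, and we assume that the pre-computed pooling takes the form of a fixed assignment of fine vertices to coarse vertices, and that interpolation uses fixed per-vertex weights that sum to one.

```python
from typing import List, Sequence, Tuple

Features = List[List[float]]  # one feature vector (list of channels) per vertex


def max_pool(fine: Features, clusters: Sequence[Sequence[int]]) -> Features:
    """Downsample: each coarse vertex takes the channel-wise maximum
    over the fine vertices assigned to it (assumed pre-computed clusters)."""
    n_channels = len(fine[0])
    return [
        [max(fine[v][c] for v in members) for c in range(n_channels)]
        for members in clusters
    ]


def interpolate(
    coarse: Features, weights: Sequence[Sequence[Tuple[int, float]]]
) -> Features:
    """Upsample: each fine vertex is a weighted sum of coarse vertices.
    `weights[i]` lists (coarse_index, weight) pairs for fine vertex i
    (assumed pre-computed, with weights summing to 1)."""
    n_channels = len(coarse[0])
    return [
        [sum(w * coarse[j][c] for j, w in row) for c in range(n_channels)]
        for row in weights
    ]


# Toy example: 4 fine vertices with 2 channels, pooled to 2 coarse vertices.
fine = [[1.0, 0.0], [3.0, 2.0], [0.5, 4.0], [2.0, 1.0]]
clusters = [[0, 1], [2, 3]]  # hypothetical fine-to-coarse assignment
weights = [  # hypothetical interpolation weights
    [(0, 1.0)],
    [(0, 0.5), (1, 0.5)],
    [(1, 1.0)],
    [(0, 0.25), (1, 0.75)],
]

coarse = max_pool(fine, clusters)
upsampled = interpolate(coarse, weights)
```

Because both `clusters` and `weights` depend only on the mesh connectivity at each resolution, they can be computed once per mesh, which is consistent with pre-processing scans to a standard resolution. In a U-net, the upsampled features would typically be combined with the fine-level features from the encoder via a skip connection; that step is omitted here.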