Abstract. We present WIR3D, a technique for abstracting 3D shapes through a sparse set of visually meaningful curves in 3D. We optimize the parameters of Bezier curves such that they faithfully represent both the geometry and salient visual features (e.g. texture) of the shape from arbitrary viewpoints. We leverage the intermediate activations of a pre-trained foundation model (CLIP) to guide our optimization process. We divide our optimization into two phases: one for capturing the coarse geometry of the shape, and the other for representing fine-grained features. Our second phase supervision is spatially guided by a novel localized keypoint loss. This spatial guidance enables user control over abstracted features. We ensure fidelity to the original surface through a neural SDF loss, which allows the curves to be used as intuitive deformation handles. We successfully apply our method for shape abstraction over a broad dataset of shapes with varying complexity, geometric structure, and texture, and demonstrate downstream applications for feature control and shape deformation.
Gallery of results. WIR3D is capable of abstracting a wide range of geometries and textures. 3D cubic Bezier curves are optimized to plausibly represent the textured shape from arbitrary viewpoints; the results show that our method produces sparse strokes that abstract visual features from varied perspectives, even though the curve renders lack explicit occlusion relationships.
Method overview. Our strokes are represented by 3D cubic Bezier curves. In the first stage, we initialize curves on the shape using furthest distance sampling, and the curves are optimized against Blender Freestyle renders to abstract the coarse geometry of the shape. The 3D regularization loss ensures the curves stay close to the shape surface and are visible from all viewpoints. The semantic loss computes the L2 distance between the CLIP embeddings of the Freestyle renders and the curve renders (see the sketch below). In the second stage, we freeze the curves from the first stage and add new curves that are optimized to represent the shape's texture. The new curves are initialized from semantic keypoints on the surface, which can be user-specified or automatically detected. The keypoints also spatially localize the semantic loss, so the optimization is guided by the semantic features identified by the keypoints. Importantly, the second stage of training ensures that visually salient texture and geometry features are abstracted.
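To make the semantic supervision concrete, the following is a minimal sketch of a CLIP-embedding L2 loss, assuming differentiable renders of the curves and the Freestyle target are already available as CLIP-preprocessed image tensors; the choice of the OpenAI `clip` package and the ViT-B/32 backbone is an illustrative assumption, not necessarily the configuration used in our experiments.

```python
import torch
import clip  # OpenAI CLIP package

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # keep fp32 so gradients flow cleanly from the embeddings to the renders

def semantic_loss(curve_render: torch.Tensor, target_render: torch.Tensor) -> torch.Tensor:
    """L2 distance between CLIP image embeddings of the curve render and the
    Freestyle target render, both given as (N, 3, 224, 224) preprocessed tensors."""
    z_curve = model.encode_image(curve_render)
    z_target = model.encode_image(target_render).detach()  # target render is fixed during optimization
    return torch.sum((z_curve - z_target) ** 2)
```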
Localized keypoint loss. Our localized keypoints weight the loss between the intermediate feature maps of the encoded curve render \(I_{\text{curve}}\) and the target shape render \(I_{\text{target}}\). The weight map \(I_{\text{weight}}\) is obtained by projecting the 3D keypoints (red) into the image and applying a Gaussian filter. This loss focuses the optimization on visual features local to the keypoint.
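A minimal sketch of this weighting, assuming a 4x4 view-projection matrix for the keypoint projection and intermediate feature maps of shape (B, C, h, w); the Gaussian width `sigma` and the projection conventions are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

def keypoint_weight_map(kp_3d: torch.Tensor, view_proj: torch.Tensor,
                        H: int, W: int, sigma: float = 8.0) -> torch.Tensor:
    """Project 3D keypoints (K, 3) with a 4x4 view-projection matrix, splat them
    into the image, and blur into a smooth spatial weight map I_weight."""
    ones = torch.ones(kp_3d.shape[0], 1, device=kp_3d.device)
    kp_h = torch.cat([kp_3d, ones], dim=1) @ view_proj.T          # homogeneous projection
    kp_ndc = kp_h[:, :2] / kp_h[:, 3:4]                           # normalized device coords in [-1, 1]
    px = ((kp_ndc[:, 0] + 1) * 0.5 * (W - 1)).round().long().clamp(0, W - 1)
    py = ((1 - (kp_ndc[:, 1] + 1) * 0.5) * (H - 1)).round().long().clamp(0, H - 1)
    weight = torch.zeros(1, 1, H, W, device=kp_3d.device)
    weight[0, 0, py, px] = 1.0                                    # splat projected keypoints
    weight = gaussian_blur(weight, kernel_size=4 * int(sigma) + 1, sigma=sigma)
    return weight / (weight.max() + 1e-8)                         # normalize to [0, 1]

def localized_loss(feat_curve: torch.Tensor, feat_target: torch.Tensor,
                   weight: torch.Tensor) -> torch.Tensor:
    """Weighted L2 between intermediate feature maps of the two renders."""
    w = F.interpolate(weight, size=feat_curve.shape[-2:], mode="bilinear")
    return (w * (feat_curve - feat_target) ** 2).mean()
```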
Level of Abstraction Control. The level of abstraction is implicitly controlled by the number of curves. As the curve count grows, the abstraction captures progressively finer detail.
Deformation Application. Our deformation application exploits the close correspondence between the optimized curves and key visual features on the input surface, thanks to the SDF and keypoint localization losses. We develop a simple skinning system for the surface in which each vertex is assigned a set of skinning weights to points sampled on all the curves in the scene. These skinning weights are based on the L2 distance between each vertex and sampled point, with a softmax applied so they sum to 1. Transformations applied to each curve can then be automatically mapped to the surface through these skinning weights, and the procedure runs at interactive speeds. Note that no smoothing postprocess is applied to the mapped transformations; the smoothness of the deformations is a result of how effectively the curves interpolate these quantities along the surface.
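A minimal sketch of this distance-based skinning, assuming curve sample points are given and restricting the mapped transformations to per-point translations for brevity; the softmax temperature `beta` is an illustrative parameter not specified in the text.

```python
import torch

def skinning_weights(verts: torch.Tensor, curve_pts: torch.Tensor,
                     beta: float = 50.0) -> torch.Tensor:
    """Softmax over negative L2 distances from each surface vertex (V, 3) to
    every sampled curve point (P, 3), so the weights of each vertex sum to 1."""
    d = torch.cdist(verts, curve_pts)        # (V, P) pairwise L2 distances
    return torch.softmax(-beta * d, dim=1)   # (V, P) skinning weights

def deform(verts: torch.Tensor, curve_pts: torch.Tensor,
           curve_pts_new: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Map per-point curve displacements onto the surface via the skinning weights."""
    delta = curve_pts_new - curve_pts        # (P, 3) displacement of each curve sample
    return verts + weights @ delta           # (V, 3) deformed surface vertices
```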
Keypoint Control Application. Our localized weighting framework allows user control over which features are represented in the abstraction. After optimization, the user can further refine the curves by selecting additional keypoints, and new curves can be quickly optimized to add detail to the feature of interest. Keypoints can be used to make specific structures more explicit (e.g. the wheels on the plane) or to add texture detail (e.g. the Nefertiti headband). The refinement is rapid, completing in a few hundred iterations, or around a minute.