Turning Stable Diffusion images into animated characters
- Stable Diffusion v1.5
- Automatic1111 WebUI
- Artstudio Pro
- ZBrush
- Substance Painter
When creating the game characters I had three main requirements:
- The characters must have the same style as the comic and game world
- Each character should remain consistent throughout the pipeline
- The process should turn a Stable Diffusion generated image into a 3D model
I start out with a character reference sheet to ensure all (male) characters have similar proportions and layout, which helps keep the base workflow optimised. I found this example online: it's a drawing by illustrator Andrew Loomis, adapted by Jack Monahan of gausswerks.com.
To design my characters I mainly work in Artstudio Pro on iPad, since the tablet form factor comes close to feeling like a sketchbook (I also use a paperlike screen protector to give the Apple Pencil some resistance and avoid the smooth glass finish).
First I draw rough outlines of the character outfit over the top of the reference guide.
Then I do the flat color fills to define the overall color scheme for the character.
The final pass adds some basic shading and texture that will help inform the clothing materials and details when the image is processed with Stable Diffusion. The outlines are still visible, but this doesn't actually matter much in the img2img process, though you could paint them out.
When using img2img in the Automatic1111 WebUI, I generally do batch generations of 9 outputs at 512x768 resolution, whilst adjusting the Denoising Strength between 0.3 and 0.7 to get a good variation of elements that I can composite together in Photoshop to create the final output.
I first used img2img for the front sketch:
Oil painting portrait of a futuristic merchant
Next I repeated the process with the rear view sketch. I could probably do both in one go, but I keep them separate so I can alter the prompt to explicitly include the rear view:
Oil painting portrait rear view of a futuristic merchant
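For reference, the same img2img batches can be driven through the WebUI's API rather than the browser interface (when the WebUI is launched with the --api flag). Below is a minimal Python sketch: the endpoint and payload fields follow the /sdapi/v1/img2img route, while the file paths, negative prompt, and step settings are just placeholder assumptions.

```python
# Minimal sketch of running the img2img step via the Automatic1111 WebUI API.
# Assumes a local WebUI started with --api; file paths are placeholders.
import base64
import requests

WEBUI_URL = "http://127.0.0.1:7860"

def img2img(sketch_path, prompt, denoising_strength, batch_size=9):
    # The API expects the source sketch as a base64-encoded string
    with open(sketch_path, "rb") as f:
        init_image = base64.b64encode(f.read()).decode("utf-8")

    payload = {
        "init_images": [init_image],
        "prompt": prompt,
        "negative_prompt": "blurry, deformed",  # assumed negative prompt
        "denoising_strength": denoising_strength,
        "width": 512,
        "height": 768,
        "batch_size": batch_size,
        "steps": 30,
        "cfg_scale": 7,
    }
    r = requests.post(f"{WEBUI_URL}/sdapi/v1/img2img", json=payload, timeout=600)
    r.raise_for_status()
    return r.json()["images"]  # list of base64-encoded PNGs

# One batch of 9 at a mid-range denoising strength; repeat at other values
for i, img_b64 in enumerate(img2img("front_sketch.png",
                                    "Oil painting portrait of a futuristic merchant",
                                    denoising_strength=0.5)):
    with open(f"front_out_{i}.png", "wb") as f:
        f.write(base64.b64decode(img_b64))
```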
In Photoshop I pick the best outputs and layer them on top of each other, using layer masks to paint the specific regions I want to replace, and do extra passes of digital painting on top to blend it all together.
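The compositing itself is a Photoshop step, but the underlying layer-mask idea can be sketched with Pillow for anyone who wants to script it: a grayscale mask decides where the replacement output shows through over the base image. The file names below are hypothetical.

```python
# Rough equivalent of a Photoshop layer mask using Pillow: white areas of the
# mask take pixels from the replacement output, black areas keep the base.
# File names are placeholders for whichever img2img outputs were picked.
from PIL import Image

base = Image.open("best_overall.png").convert("RGB")
replacement = Image.open("better_jacket.png").convert("RGB").resize(base.size)
mask = Image.open("jacket_mask.png").convert("L").resize(base.size)  # grayscale mask

composite = Image.composite(replacement, base, mask)
composite.save("character_composite.png")
```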
Here is the final comparison between the rough sketch I made and the final composite of img2img results. Overall I'm happy with how the style transferred from my original design, though, as is typical of current AI generations, the hands are terrible and I have to fix them later.
I use the initial character sketch with outlines to help guide my modelling in ZBrush, by loading the image into Spotlight at a low opacity. My current method is to use DynaMesh to form the main shapes of the character; it lets you add extra volume by pulling areas out with the Move Brush, and DynaMesh generates the extra polygons needed to keep a uniform surface.
I then load the final front character image into Spotlight and, using a Brush set to RGB with Zadd off, Polypaint the image information onto the mesh, where it is stored in the vertex colors.
This is done for the front and then the rear, just as an initial reference to help guide the modelling of the rest of the details. (Make sure RGB is turned off on the brushes again once you switch back to sculpting.)
I increase the resolution of the DynaMesh model and progressively sculpt the new details and re-project the image color data with Polypaint until I’m happy with the main forms. I avoid doing the hands since I will add these in later on.
Because the front and back projections don't cover all areas of the model, I manually paint rough color information into the missing areas by sampling the nearby colors.
I add the missing back part of the jacket by appending a cube mesh and Boolean merging it with the DynaMesh model. Next I paint new color detail on top of the missing areas, trying to match the style of the original projected images and to help inform any further sculpting that needs doing.
Before doing a final sculpting pass on the character, I create Polygroups for the regions I want to UV unwrap, then use ZRemesher to get a lower poly count model. After increasing the Subdivision levels to 6, I project the details from the sculpted DynaMesh model.
To finish Polypainting the model, I find an area with rough details and take a screenshot of the ZBrush canvas. This image is fed into WebUI img2img to generate the new details, using a Denoising Strength of 0.3-0.5 so that the output closely follows the colors and shapes of the initial image. It usually takes a few generations with different settings until I land on the result I want.
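Those repeated generations at different settings can also be scripted as a small sweep over Denoising Strength values against the same /sdapi/v1/img2img endpoint used earlier; the screenshot path and prompt here are placeholders.

```python
# Sweep a ZBrush canvas screenshot through img2img at several low denoising
# strengths, so the outputs stay close to the original colors and shapes.
import base64
import requests

with open("zbrush_canvas_screenshot.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")

for strength in (0.3, 0.4, 0.5):
    payload = {
        "init_images": [init_image],
        "prompt": "Oil painting of futuristic clothing, detailed fabric",  # placeholder prompt
        "denoising_strength": strength,
        "width": 512,
        "height": 768,
        "steps": 30,
    }
    r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=600)
    r.raise_for_status()
    for i, img_b64 in enumerate(r.json()["images"]):
        with open(f"detail_{strength}_{i}.png", "wb") as f:
            f.write(base64.b64decode(img_b64))
```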
I can then load the newly generated image into the ZBrush Spotlight viewer, line it up to my character mesh and project the details with Polypaint similar to how I did the initial front and back projections.
I use UVMaster to unwrap the model using the Polygroups I previously created so my model now has UVs. Next I create a Texture Map using New From Polypaint to store the painted vertex color information in a 4K texture.
The Normal Map details for the lower poly character model can be created either in ZBrush using Multi Map Exporter, or in Substance Painter by baking Mesh Maps from the low poly and high poly versions of the model exported from ZBrush. I used the latter because I needed to add the low poly hand geometry to my model, and I manually painted the new hand texture in Substance Painter.
In Substance Painter I created a Roughness Map using base roughness values and a combination of grunge maps to add variety and break up the surface reflections. The final textures were exported for use in a Unity URP material (Base Map, Normal Map, Metallic Map, Occlusion Map).
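As a rough illustration of that roughness setup (not the actual Substance Painter layer stack), a uniform base roughness value can be broken up by a grayscale grunge texture with a simple blend; the values and file names below are only assumptions.

```python
# Illustrative blend of a base roughness value with a grunge map to break up
# uniform reflections; a sketch of the idea, not a Substance Painter export.
import numpy as np
from PIL import Image

BASE_ROUGHNESS = 0.6   # assumed base value for the fabric
GRUNGE_STRENGTH = 0.2  # how far the grunge map pushes roughness up or down

grunge = np.asarray(Image.open("grunge_map.png").convert("L"), dtype=np.float32) / 255.0
roughness = np.clip(BASE_ROUGHNESS + (grunge - 0.5) * 2.0 * GRUNGE_STRENGTH, 0.0, 1.0)

Image.fromarray((roughness * 255).astype(np.uint8)).save("roughness_map.png")
```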
I uploaded the low poly version of my character to Mixamo as an FBX with the base color texture included, to generate a rough skeletal rig and preview animations. This is really helpful for early prototyping to get an idea of how the characters look in the game, as later on I will be recording my own motion capture animations for the game characters.
I achieved a workflow that met my initial goals: generating 3D characters from Stable Diffusion images while keeping them in the style of my comic. Some processes definitely have room for improvement and optimisation:
- Creation of a subdivided base mesh as starting point for similar characters
- Doing a second texture projection pass in Substance Designer for higher quality textures
- Custom motion capture animations specific to the character
- Further exploration of other AI processes to automate or enhance stages of the workflow