2024-04-17

Mastering the "cref" parameter for MidJourney Character Consistency (feat. Creating a Spooky YouTube Short)

Great news for all MidJourney users! A new consistency parameter has been added to help maintain the same facial features across various poses and expressions of your characters.

Generative AI services like MidJourney, known for their randomness, require complex methods to maintain consistency, something we've touched on in our previous VideoStew blog post.

We put this update to the test to quench our thirst (?) for character consistency.

Creating a Key Character (Reference Image)

First, we created a main character. Oh, and by the way, MidJourney has recently unveiled a web interface that allows image generation without needing Discord. (It is still in alpha and only available to heavy users.)

< MidJourney Web Interface >

We started by generating a reference image with the following prompt:

A Korean rapper passionately performing rap on stage, face tatoo, ear rings

The images generated were interesting, and we chose the second image to upscale for better resolution.

We then right-clicked the upscaled image and copied its image address, which completed the prep work for the reference image.

Creating Image Variations

We then put our rapper in different scenarios and poses to test his versatility.

We used the following prompt, with --cref [reference image address] appended at the end:

A korean rapper is reading a newspaper at home --cref [reference image address]

The results were quite satisfying.

< Our rapper reading at home >
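
For readers who keep their prompts in a script or spreadsheet, the step above can be sketched in a few lines of Python. This is only a minimal sketch: MidJourney prompts are typed into Discord or the web interface, so the code just assembles the text to paste. The helper name with_cref and the example.com address are placeholders we made up for illustration; only the --cref parameter and the scene description come from the experiment above.

```python
def with_cref(prompt: str, reference_url: str) -> str:
    """Return the prompt with a --cref character reference appended at the end."""
    prompt = prompt.strip()
    reference_url = reference_url.strip()
    if not prompt or not reference_url:
        raise ValueError("prompt and reference_url must both be non-empty")
    return f"{prompt} --cref {reference_url}"


# Placeholder address: replace it with the image address copied from the upscaled image.
REFERENCE_URL = "https://example.com/rapper-reference.png"

if __name__ == "__main__":
    print(with_cref("A korean rapper is reading a newspaper at home", REFERENCE_URL))
```

Keeping the reference address in one variable means every new scene reuses exactly the same character reference.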

It's also worth knowing about --cw (character weight). When it is not specified, the default is --cw 100, which tries to retain all attributes of the reference. We ran the experiment again with --cw 0:

With --cw set to 0, the tattoos, hats, and other attributes changed significantly, while the face stayed the same.

< Rapper with changed outfit and tattoos >
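
To compare character weights side by side, one option is to generate the same scene at several --cw values. The sketch below only builds the prompt strings. The function names are hypothetical, the 0 to 100 range is our assumption about the valid values of --cw, and the intermediate value 50 is purely illustrative, since we only tested 0 and the default 100.

```python
from typing import Iterable, List


def with_character_weight(prompt: str, reference_url: str, cw: int = 100) -> str:
    """Append --cref and an explicit --cw (character weight) to a prompt."""
    # Assumption: --cw accepts integers from 0 (face only) to 100 (all attributes).
    if not 0 <= cw <= 100:
        raise ValueError("cw is assumed to be between 0 and 100")
    return f"{prompt.strip()} --cref {reference_url.strip()} --cw {cw}"


def cw_sweep(prompt: str, reference_url: str,
             weights: Iterable[int] = (0, 50, 100)) -> List[str]:
    """Build one prompt per character weight so the results can be compared."""
    return [with_character_weight(prompt, reference_url, w) for w in weights]


if __name__ == "__main__":
    # Placeholder address standing in for the copied reference image address.
    for line in cw_sweep("A korean rapper is reading a newspaper at home",
                         "https://example.com/rapper-reference.png"):
        print(line)
```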

Can Character Consistency Be Used in Webtoon Production?

Since the basic functionality of the character consistency parameters worked, we quickly tried it out for webtoon production:

a Korean female university student, korean comics style

Out of the four results, we upscaled the second image and planned to try various poses and settings.

< Reference image in webtoon style >

We simplified the prompt to just "a lady" while applying the --cref.

The result was a photorealistic version of the webtoon character, same style, same outfit, definitely the same person. Is this the real-life version of a webtoon character?

< The real-life webtoon girl appears >

Retrying with the Original Prompt

So, we tried again, keeping the original prompt but adding --cref to create an image of the protagonist sitting on a bus, and it was a huge success!

< The protagonist tearing out of and into the comics >

What if Two Characters Appear?

The prospect of using this for webtoons or storytelling is becoming more intriguing. What if we tried creating an image of a girl chatting with her boyfriend at a café?

Wow, one out of four was a success, but the run also produced strange images of clone-like characters chatting with each other.

We tried the Rerun function to regenerate 8 images. The result? It was disastrous...

Trying Style Consistency Prompts

If cref helps maintain character consistency, sref maintains stylistic consistency.

We attempted to create a scene with a horrifying atmosphere in a Korean comics style. First, we generated a terrifying setting:

An indoor setting with a terrifying atmosphere, korean comics style

After copying the address of the first reference image, we applied both reference images together:

A Korean female college student running away in fear at subway station, korean comics style --sref [reference image address] --cref [reference image address]

The result? A success! Our protagonist was created running in fear on a subway platform in the desired style.
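
Because the style reference and the character reference are two different image addresses, it is easy to mix them up when pasting. A small sketch like the one below keeps them apart. The References class, the build_prompt function, and the example.com addresses are hypothetical names for illustration; only the --sref and --cref parameters and the scene text come from our test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class References:
    """Image addresses used as references; either one may be left out."""
    character_url: Optional[str] = None  # goes to --cref
    style_url: Optional[str] = None      # goes to --sref


def build_prompt(description: str, refs: References) -> str:
    """Append --sref and then --cref, in the same order as the prompt above."""
    parts = [description.strip()]
    if refs.style_url:
        parts.append(f"--sref {refs.style_url}")
    if refs.character_url:
        parts.append(f"--cref {refs.character_url}")
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder addresses standing in for the two copied image addresses.
    refs = References(
        character_url="https://example.com/student-reference.png",
        style_url="https://example.com/horror-room-reference.png",
    )
    print(build_prompt(
        "A Korean female college student running away in fear at subway station, "
        "korean comics style",
        refs,
    ))
```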

That prompt was itself about fear and fleeing, though, so the horror style may simply have worked in our favor. Can we maintain the horrific style while depicting our Korean college student smiling and drinking coffee in a park?

Yes, it worked! The original color palette and feel of the style reference clearly came through, even though the character was smiling.

Creating Spooky YouTube Shorts

Building on our experiments, there's potential to create a horror short. We asked GPT for simple scary stories and generated a matching image for each sentence using the style and character consistency parameters.

The result after putting everything together on VideoStew?

Combine haunting background music and a suitable AI voice, and voila, a decent-quality video is easily crafted. Preparing the required images for each sentence beforehand made the process incredibly fast.
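
The "one image per sentence" preparation can be sketched as a small storyboard builder. This is a rough illustration, not the exact workflow we ran: the storyboard function, the narration lines, and the example.com addresses are invented placeholders, and the second scene description is a paraphrase. The idea is simply that each story sentence gets one image prompt carrying the same --sref and --cref.

```python
from typing import Dict, List, Tuple


def storyboard(scenes: List[Tuple[str, str]], style_url: str, character_url: str,
               style_suffix: str = "korean comics style") -> List[Dict[str, object]]:
    """Pair each narration sentence with an image prompt sharing the same references."""
    rows = []
    for number, (narration, description) in enumerate(scenes, start=1):
        prompt = (f"{description.strip()}, {style_suffix} "
                  f"--sref {style_url} --cref {character_url}")
        rows.append({"scene": number, "narration": narration, "prompt": prompt})
    return rows


if __name__ == "__main__":
    # Hypothetical story sentences; in practice these would come from the generated story.
    scenes = [
        ("She heard footsteps behind her on the empty platform.",
         "A Korean female college student running away in fear at subway station"),
        ("The next morning, everything seemed normal again.",
         "A Korean female college student smiling and drinking coffee in a park"),
    ]
    for row in storyboard(scenes,
                          style_url="https://example.com/horror-room-reference.png",
                          character_url="https://example.com/student-reference.png"):
        print(row["scene"], row["prompt"])
```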

So, what's the verdict?

Using cref and sref together, it's feasible to maintain graphic style and face consistency for single-character content production. However, story progression that requires multiple characters still seems challenging. Combining it with the methods we used in the past (refer to our past MidJourney Character Consistency post) might work to some extent, but don't get your hopes too high. With generative AI services evolving daily, let's hope for an update that handles multi-character consistency next time!

