ToonCrafter: Create Cartoon with Two Images
Suuup - this is the SinkIn Newsletter, a 5-minute read made at sinkin.ai covering the most interesting stuff in the Image AI world.
ToonCrafter is a new method for cartoon creation that overcomes the limitations of traditional techniques. Given a start frame and an end frame as input, it generates a smooth cartoon video in between. It uses a toon rectification learning strategy, a dual-reference-based 3D decoder, and a flexible sketch encoder to ensure high-quality, detailed, and user-controllable results. Experiments show that ToonCrafter produces more natural dynamics than existing methods, and it could be a great help to the anime production industry.
Stability AI has introduced Stable Audio Open, an open-source model designed to generate up to 47-second audio samples from text prompts. This model is ideal for creating drum beats, instrument riffs, ambient sounds, and production elements. Unlike the commercial Stable Audio, which can produce full tracks, Stable Audio Open focuses on short audio samples and lets users fine-tune the model on their own audio data. Model weights are available on Hugging Face.
Stability AI plans to release Stable Diffusion 3 Medium, a 2-billion-parameter text-to-image model, on June 12th. The weights will be available on Hugging Face. It excels in areas where previous SD models struggled: photorealism, superior typography, optimized performance, and precise fine-tuning. According to a Stability staff member, larger versions (4B and 8B) will be released as they are finished.
The creator of ControlNet (GitHub handle: @lllyasviel) just launched a new project: Omost. It is designed to enhance image generation by converting an LLM's coding capabilities into image-composing abilities. It uses virtual "Canvas" agents that write code to compose visual content, which can then be rendered into images. The project includes three pretrained models based on Llama 3 and Phi-3, trained with diverse datasets and reinforcement techniques. You can try it out with the demo on Hugging Face.
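To make the "compose images as code" idea concrete, here is a toy sketch of what an LLM-written canvas program might look like. This is not Omost's actual API; every class and method name below is hypothetical, invented purely for illustration:

```python
# Toy illustration of the canvas-as-code idea: the LLM emits code like this,
# and a renderer turns the declared regions into an actual image.
# All names here are hypothetical, NOT Omost's real API.

class Canvas:
    def __init__(self):
        self.global_description = ""
        self.regions = []

    def set_global_description(self, text):
        # Overall scene prompt for the whole image.
        self.global_description = text

    def add_region(self, description, box):
        # box = (left, top, right, bottom) in relative [0, 1] coordinates.
        self.regions.append({"description": description, "box": box})


canvas = Canvas()
canvas.set_global_description("a cozy cabin in a snowy forest at dusk")
canvas.add_region("warm light glowing from the windows", (0.3, 0.4, 0.6, 0.7))
canvas.add_region("pine trees dusted with snow", (0.0, 0.0, 1.0, 0.5))

# A renderer would translate these regions into spatial conditioning
# (e.g., regional prompts or attention masks) for a diffusion model.
for r in canvas.regions:
    print(r["box"], "->", r["description"])
```

The appeal of this representation is that code is something LLMs are already good at writing, so composition (what goes where) becomes a text-generation problem while rendering stays with the image model.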
This is a tutorial on how to do virtual clothing try-on with ComfyUI, using ControlNet, IPAdapter, and ReActor Face Swap. The results look quite good. The tutorial starts from a basic ComfyUI install, so it's friendly to anyone who's new to ComfyUI. Take a look if you're interested.
Meme of the Day
Celebrities in Everyday Roles
That’s it for today. Hope these stories were as refreshing as a nice bathroom sink!
What'd you think of today's edition?