Sink In
DALL-E as Battlefield Tool, Coachella animation, Perturbed-Attention Guidance, Pixart Sigma, InstantMesh

April 17, 2024

Hello, this is the SinkIn Newsletter, made by sinkin.ai. Interesting things happen in the image AI world every day. We try to capture them in a 5-minute read, so you can quickly stay up to speed with the latest trends and breakthroughs.

Microsoft Pitched OpenAI’s DALL-E as Battlefield Tool for U.S. Military

Microsoft has proposed leveraging OpenAI's DALL-E, an AI image generation tool, for U.S. military applications, despite OpenAI's stated mission to avoid harm and weaponry development. This suggestion surfaced amidst OpenAI’s recent shift away from prohibiting military work, marking a potential pivot in the company's ethical stance. The proposal detailed using DALL-E's synthetic image generation to train military battle management systems, aiding in target recognition and coordination during combat. While OpenAI denies involvement or selling tools for these purposes, the discussion raises questions about the ethical implications of AI's role in military operations and the potential indirect contribution to conflict.

Workflow: how the animation played at Coachella during Anymas + Grimes song debut was created

This is the ComfyUI workflow used to create an animation video played at Coachella (check out the video below). The workflow uses two IPAdapters and an alpha mask to separate the subject from the background, so you have full control over each and neither is tied to the other. You’ll also find a video tutorial walkthrough of the workflow on the Civitai page.
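To make the alpha mask idea concrete, here is a minimal sketch in plain Python. It is not the ComfyUI workflow itself, and the function name `composite` and the tiny 2x2 "images" are made up for illustration. It only shows the general principle: a mask value of 1 selects the subject, 0 selects the background, and values in between blend the two, which is why each side can be changed without affecting the other.

```python
def composite(subject, background, alpha):
    """Blend two same-sized grayscale images pixel by pixel.

    alpha is a mask with values in [0, 1]:
    1.0 keeps the subject pixel, 0.0 keeps the background pixel.
    (Hypothetical helper for illustration, not part of ComfyUI.)
    """
    if not (len(subject) == len(background) == len(alpha)):
        raise ValueError("images and mask must have the same number of rows")
    out = []
    for s_row, b_row, a_row in zip(subject, background, alpha):
        if not (len(s_row) == len(b_row) == len(a_row)):
            raise ValueError("images and mask must have the same row length")
        out.append([a * s + (1.0 - a) * b for s, b, a in zip(s_row, b_row, a_row)])
    return out


# Toy data: an all-white subject, an all-black background, and a mask
# that keeps the subject in the left column only (half-strength bottom left).
subject = [[1.0, 1.0], [1.0, 1.0]]
background = [[0.0, 0.0], [0.0, 0.0]]
alpha = [[1.0, 0.0], [0.5, 0.0]]

result = composite(subject, background, alpha)
print(result)
```

Swapping in a different `background` leaves every pixel where the mask is 1.0 untouched, which is the kind of separation the workflow relies on.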

The Coachella animation video

Perturbed-Attention Guidance is the real thing - increased fidelity, coherence, cleaned-up compositions

Recently implemented in ComfyUI, Perturbed-Attention Guidance (PAG) improves prompt adherence and composition coherence without sacrificing image quality. It preserves image fidelity while bringing structure to complex scenes. Check out the user-shared basic pipeline settings and the A/B image examples. Experiment with the recommended checkpoints for different styles, and see how PAG, along with the optional AutomaticCFG, can make your AI-generated art more coherent.

Pixart Sigma HF Space (a new model with great prompt alignment capabilities)

Generate high-fidelity 4K images from text prompts using PixArt-Sigma, a state-of-the-art diffusion model. PixArt-Sigma achieves excellent alignment with prompts. It does so efficiently, evolving from PixArt-alpha through a process termed weak-to-strong training - leveraging higher quality data and an improved attention mechanism. With just 0.6 billion parameters, PixArt-Sigma reaches new heights in text-to-image generation.

prompt: orange cat wrapped in white bandages and black dog wrapped in red bandages sitting on a bench on top of a hill filled with round stones, photo, cinematic

InstantMesh: Efficient 3D Mesh Generation from a Single Image

The InstantMesh framework from Tencent ARC converts a single image into a 3D mesh. It combines a conventional multiview diffusion model with a sparse-view reconstruction model for improved generation quality and training scalability. The system is designed to produce varied 3D assets quickly, with generation times averaging around 10 seconds. Comparisons on public datasets indicate that InstantMesh outperforms current image-to-3D conversion methods. The team behind InstantMesh has made the code, weights, and a demo application openly available. Check out the HuggingFace demo.

Meme of the Day

That’s it for today, see you next time!