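Chroma is an all-purpose creative, photography, and anime model: an 8.9B-parameter model based on FLUX.1-schnell, fully open source under the Apache 2.0 license. Its training dataset of 5M samples was curated from a pool of 20M and covers anime, furry, artwork, and photos.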
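A while back I posted about Chroma, my work-in-progress open-source foundation model. I received a lot of great feedback, and I'm happy to announce that base model training is finally complete and the whole family of models is now available for everyone to use!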
A quick refresher on the promise here: these are true base models.
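I haven't done any aesthetic tuning or used post-training techniques such as DPO. The models are raw and powerful, and they are meant to be a clean, neutral starting point for your own fine-tuning. We did the heavy lifting so you don't have to.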
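By "heavy lifting" I mean roughly 105,000 H100 hours of compute. All of that GPU time went into packing these models with a massive data distribution, which should make fine-tuning on top of them a breeze.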
As promised, everything is fully Apache 2.0 licensed, with no gatekeeping.
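About the v1.0-HD version
This is the high-resolution fine-tune of Chroma1-Base, trained at 1024×1024. If you need a quick fine-tune or a LoRA at high resolution, this is your starting point.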
Companion downloads (optional)
Chroma professional photography LoRA

V2.0
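A slightly different approach this time: I merged several smaller LoRA batches into the checkpoint and then extracted a single LoRA from it. This is much easier to manage, because with too many images and too many different subjects, convergence takes too long. With fewer images per batch, it is faster and easier to judge when training is done.
I'm very happy with the results. It gets rid of the photo edges that in some cases still looked unnatural or made characters look photoshopped in.
For anything involving people or animals, I would keep the strength on the lower side (around 0.7), because the LoRA can break anatomy, and Chroma itself handles the rest well enough.
I consider it finished. I don't think adding more images would help.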
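The strength advice above can be applied mechanically when prompts carry inline LoRA tags. The sketch below assumes the `<lora:name:weight>` tag syntax that appears in the sample prompts later on this page. Whether your UI honors such tags is an assumption, and `set_lora_strength` is a hypothetical helper, not part of any tool.

```python
import re

# Matches inline tags of the form <lora:NAME:WEIGHT> (syntax assumed from the sample prompts).
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def set_lora_strength(prompt: str, strength: float) -> str:
    """Return the prompt with the weight of every <lora:...> tag set to `strength`."""
    return LORA_TAG.sub(lambda m: f"<lora:{m.group(1)}:{strength}>", prompt)


if __name__ == "__main__":
    prompt = "portrait of a hiker <lora:-_Chroma_-_profphotos_high_res_concatextr_2.0:0.8>"
    # Lower the strength for a subject with people, as suggested above.
    print(set_lora_strength(prompt, 0.7))
```

Prompts without a LoRA tag pass through unchanged, so the helper is safe to run over a whole batch.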
Chroma Asian portrait LyCORIS

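The training data comes from the model hub and includes 9527 Detail Realistic XL, which is the main source for East Asian faces; it also includes self-trained material.
Trigger words: photorealistic asian woman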
Chroma workflow

Chroma Modular WF with DetailDaemon, Inpaint, Upscaler and FaceDetailer.
As with my HiDream workflow, this lets you use:
- txt2img or img2img,
- Detail-Daemon,
- Inpaint,
- HiRes-Fix,
- Ultimate SD Upscale,
- FaceDetailer.
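You need a t5xxl text encoder model file, which you can find in this repository (downloads included). fp16 is recommended; if you are short on memory, use fp8_scaled instead. Put the file in the ComfyUI/models/text_encoders/ folder. The VAE is the same one used for FLUX and HiDream, so you should already have it.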
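Before loading the workflow, it can help to confirm that an encoder file is actually in place. The following is a minimal sketch, not part of ComfyUI: `find_t5xxl_encoder` is a hypothetical helper, and matching files by the substrings "t5xxl", "fp16", "fp8", and "scaled" is an assumption about how the downloaded files are named.

```python
from pathlib import Path
from typing import Optional


def find_t5xxl_encoder(comfy_root: str, low_memory: bool = False) -> Optional[Path]:
    """Return a t5xxl encoder file under models/text_encoders, or None if absent."""
    encoder_dir = Path(comfy_root) / "models" / "text_encoders"
    if not encoder_dir.is_dir():
        return None

    candidates = sorted(
        p for p in encoder_dir.iterdir()
        if p.is_file() and "t5xxl" in p.name.lower()
    )

    # fp16 is preferred; the scaled fp8 variant comes first only when memory is tight.
    # The substring groups below are assumptions about the file naming.
    fp16, fp8_scaled = ("fp16",), ("fp8", "scaled")
    order = [fp8_scaled, fp16] if low_memory else [fp16, fp8_scaled]

    for tags in order:
        for path in candidates:
            if all(tag in path.name.lower() for tag in tags):
                return path
    # Fall back to any t5xxl file if neither variant is recognized.
    return candidates[0] if candidates else None


if __name__ == "__main__":
    found = find_t5xxl_encoder("ComfyUI")
    print(found if found else "No t5xxl encoder found in ComfyUI/models/text_encoders/")
```

Passing `low_memory=True` flips the preference toward the fp8_scaled file, mirroring the recommendation above.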
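Important: before running the workflow, you must load an image (any image) in the "Load Image" node, located to the left of the "Prompt" node; otherwise the workflow will throw an error. This is only required the first time you use the workflow, but it is required even if you do not plan to use the image for img2img or inpaint.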
Sample images

A high quality photo taken on a Kodak Portra 400 55mm analog camera from Flickr. In a surreal landscape, A Korean teenager female, and a futuristic black cat with piercing lavender colored eyes and a black tail is seen in the distance. But instead of being vulnerable, it wears the iconic Crimson bow, adorned with sparkling Glasses, an unexpected accessory that seems to be hidden behind its furry body. The background is an intricately detailed design featuring swirling patterns and vibrant colors adorning its face, Raining, in focus, Intense, Fairy-Tale, Fast Shutter Speed
Negative prompt: This greyscale unfinished sketch has bad proportions, is featureless and disfigured. It is a blurry ugly mess and with excessive gaussian blur. It is riddled with watermarks and signatures. Everything is smudged with leaking colors and nonsensical orientation of objects. Messy and abstract image filled with artifacts disrupt the coherency of the overall composition. The image has extreme chromatic abberations and inconsistent lighting. Dull, monochrome colors and countless artistic errors. 6 toes. six toes. 6 fingers. six fingers. supernumerary fingers. anatomically inacurrate. deformed. disfigured. duplicate. twin. bad hands. bad foots.
Sampler: er_sde

A warm and inviting still-life of a traditional Eastern European soup served in a decorated ceramic bowl, resting on a wooden table. The soup is vibrant with red broth, meatballs, and sprinkled herbs, accompanied by fresh bread slices, a small bowl of sour cream, and a shot glass of clear liquor. A wooden pepper mill and stacked bowls sit in the background, with a folded embroidered napkin in the foreground adding cultural detail. Sunlight streams softly into the scene, highlighting the textures of the bread crust, ceramic glaze, and steam rising faintly from the soup. The atmosphere feels homely, rustic, and cinematic, evoking a cozy family meal in a countryside kitchen. <lora:-_Chroma_-_profphotos_high_res_concatextr_2.0:0.8>
Steps: 30, CFG scale: 4.000000000000001, Sampler: res_multistep_beta, Seed: 856372928526614, VAE: ae.safetensors, Model: Chroma1-HD-Q8_0.gguf, Model hash: 8c8d66177b, Clip skip: 1

aesthetic 9, digital_media_(artwork), surreal, cinematic, dark_theme, absurd_res. An extreme close-up of a brutal orc warrior's face, covered in smeared blood and dirt. His rough gray-green skin is scarred and cracked, sharp yellow fangs jutting out from his mouth. One glowing red eye stares with raw fury, filling the frame with menace. The shallow depth of field blurs the background, focusing entirely on the grotesque details of his skin texture, wounds, and monstrous teeth. The atmosphere is gritty and war-torn, evoking the tension of a battlefield. Rendered with film-like grain, muted earthy tones, and dramatic lighting, the image feels like a frame torn from a grim dark-fantasy epic. <lora:-_Chroma_-_profphotos_high_res_concatextr_2.0:0.8>
Steps: 30, CFG scale: 4.000000000000001, Sampler: res_multistep_beta, Seed: 421130046251498, VAE: ae.safetensors, Model: Chroma1-HD-Q8_0.gguf, Model hash: 8c8d66177b, Clip skip: 1
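
The settings lines under the samples above use a comma-separated "Key: value" layout, which makes them easy to read back into a script when reproducing an image. The sketch below is a minimal parser under that assumption. `parse_settings` is a hypothetical helper, and it would mis-split any value that itself contains ", ".

```python
from typing import Dict


def parse_settings(line: str) -> Dict[str, str]:
    """Split a 'Key: value, Key: value' settings line into a dict of strings."""
    settings: Dict[str, str] = {}
    for field in line.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # skip fragments that have no 'Key: value' shape
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    line = (
        "Steps: 30, CFG scale: 4.000000000000001, Sampler: res_multistep_beta, "
        "Seed: 856372928526614, VAE: ae.safetensors, Model: Chroma1-HD-Q8_0.gguf, "
        "Model hash: 8c8d66177b, Clip skip: 1"
    )
    settings = parse_settings(line)
    # The recorded CFG value carries floating-point noise; round it for display.
    print(settings["Sampler"], int(settings["Steps"]), round(float(settings["CFG scale"]), 2))
```

Values are kept as strings on purpose, so the caller decides which fields to convert to numbers.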