2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -366,6 +366,8 @@
  title: LatteTransformer3DModel
- local: api/models/ltx_video_transformer3d
  title: LTXVideoTransformer3DModel
- local: api/models/lumina2_accessory_transformer2d
  title: Lumina2AccessoryTransformer2DModel
- local: api/models/lumina2_transformer2d
  title: Lumina2Transformer2DModel
- local: api/models/lumina_nextdit2d
31 changes: 31 additions & 0 deletions docs/source/en/api/models/lumina2_accessory_transformer2d.md
@@ -0,0 +1,31 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# Lumina2AccessoryTransformer2DModel

A Diffusion Transformer model for 2D data from [Lumina-Accessory](https://github.com/Alpha-VLLM/Lumina-Accessory) by Alpha-VLLM.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import Lumina2AccessoryTransformer2DModel

ckpt_path = "https://huggingface.co/Alpha-VLLM/Lumina-Accessory/blob/main/consolidated.00-of-01.pth"
transformer = Lumina2AccessoryTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
```
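
The checkpoint path above is a Hub file URL rather than a local path. As a rough illustration of what that URL encodes, the sketch below splits it into a repository id, a revision, and a file name. `split_hub_blob_url` is a hypothetical helper written for this example, not part of diffusers, and it assumes the URL layout `https://huggingface.co/<org>/<repo>/blob/<revision>/<path>` with a revision that contains no slashes.

```python
from urllib.parse import urlparse


def split_hub_blob_url(url: str) -> dict:
    """Split a Hub file URL into repo id, revision, and file path.

    Hypothetical helper for illustration only; assumes the layout
    https://huggingface.co/<org>/<repo>/blob/<revision>/<path>.
    """
    parts = urlparse(url).path.strip("/").split("/")
    # Expect: org, repo, "blob" (or "resolve"), revision, then the file path.
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"Unrecognized Hub file URL: {url}")
    return {
        "repo_id": "/".join(parts[:2]),
        "revision": parts[3],
        "filename": "/".join(parts[4:]),
    }


info = split_hub_blob_url(
    "https://huggingface.co/Alpha-VLLM/Lumina-Accessory/blob/main/consolidated.00-of-01.pth"
)
print(info)
```

The resulting pieces correspond to the `repo_id`, `filename`, and `revision` arguments of `huggingface_hub.hf_hub_download`, which is one way to fetch the checkpoint to a local path before calling `from_single_file`.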

## Lumina2AccessoryTransformer2DModel

[[autodoc]] Lumina2AccessoryTransformer2DModel

## Transformer2DModelOutput

[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
45 changes: 45 additions & 0 deletions docs/source/en/api/pipelines/lumina2.md
@@ -80,8 +80,53 @@ image = pipe(
image.save("lumina-gguf.png")
```

## Lumina Accessory

Lumina-Accessory is a multi-task instruction fine-tuning framework designed for the Lumina series. The official repository is [Alpha-VLLM/Lumina-Accessory](https://github.com/Alpha-VLLM/Lumina-Accessory).

```python
import torch
from diffusers import Lumina2AccessoryPipeline, Lumina2AccessoryTransformer2DModel
from diffusers.utils import load_image

ckpt_path = "https://huggingface.co/Alpha-VLLM/Lumina-Accessory/blob/main/consolidated.00-of-01.pth"
transformer = Lumina2AccessoryTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
pipe = Lumina2AccessoryPipeline.from_pretrained(
    "Alpha-VLLM/Lumina-Image-2.0", transformer=transformer, torch_dtype=torch.bfloat16
)

# Enable memory optimizations.
pipe.enable_model_cpu_offload()

img = load_image("https://github.com/Alpha-VLLM/Lumina-Accessory/blob/main/examples/case_1_condition.jpg?raw=true")
prompt = "A classical oil painting of a young woman dressed in a modern DARK BLACK leather jacket."
system_prompt = "You are an assistant designed to generate superior images with the highest degree of image-text alignment based on textual prompts and a partially masked image."
image = pipe(
    image=img,
    prompt=prompt,
    system_prompt=system_prompt,
    width=img.size[0],
    height=img.size[1],
    negative_prompt=" ",
    num_inference_steps=25,
    num_images_per_prompt=1,
    guidance_scale=4.0,
    cfg_trunc_ratio=1.0,
    cfg_normalization=True,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("lumina2_accessory_image_infilling.png")
```

## Lumina2Pipeline

[[autodoc]] Lumina2Pipeline
  - all
  - __call__


## Lumina2AccessoryPipeline

[[autodoc]] Lumina2AccessoryPipeline
  - all
  - __call__
4 changes: 4 additions & 0 deletions src/diffusers/__init__.py
@@ -221,6 +221,7 @@
"Kandinsky3UNet",
"LatteTransformer3DModel",
"LTXVideoTransformer3DModel",
"Lumina2AccessoryTransformer2DModel",
"Lumina2Transformer2DModel",
"LuminaNextDiT2DModel",
"MochiTransformer3DModel",
@@ -496,6 +497,7 @@
"LTXLatentUpsamplePipeline",
"LTXPipeline",
"LucyEditPipeline",
"Lumina2AccessoryPipeline",
"Lumina2Pipeline",
"Lumina2Text2ImgPipeline",
"LuminaPipeline",
@@ -906,6 +908,7 @@
Kandinsky3UNet,
LatteTransformer3DModel,
LTXVideoTransformer3DModel,
Lumina2AccessoryTransformer2DModel,
Lumina2Transformer2DModel,
LuminaNextDiT2DModel,
MochiTransformer3DModel,
@@ -1151,6 +1154,7 @@
LTXLatentUpsamplePipeline,
LTXPipeline,
LucyEditPipeline,
Lumina2AccessoryPipeline,
Lumina2Pipeline,
Lumina2Text2ImgPipeline,
LuminaPipeline,
4 changes: 4 additions & 0 deletions src/diffusers/loaders/single_file_model.py
@@ -134,6 +134,10 @@
        "checkpoint_mapping_fn": convert_lumina2_to_diffusers,
        "default_subfolder": "transformer",
    },
    "Lumina2AccessoryTransformer2DModel": {
        "checkpoint_mapping_fn": convert_lumina2_to_diffusers,
        "default_subfolder": "transformer",
    },
    "SanaTransformer2DModel": {
        "checkpoint_mapping_fn": convert_sana_transformer_to_diffusers,
        "default_subfolder": "transformer",
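
The new entry reuses `convert_lumina2_to_diffusers`, so both Lumina2 transformer classes resolve to the same conversion function. A minimal sketch of that lookup pattern follows. The registry name `TOY_REGISTRY`, the helper `convert_checkpoint`, and the stand-in `strip_prefix_stub` are all hypothetical and exist only for illustration; this is not the actual diffusers loader code or the real key mapping.

```python
def strip_prefix_stub(state_dict, prefix="model."):
    """Stand-in for a real conversion function: drops a hypothetical key prefix."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy registry keyed by class name, mirroring the shape of the entries in the hunk.
TOY_REGISTRY = {
    "Lumina2Transformer2DModel": {
        "checkpoint_mapping_fn": strip_prefix_stub,
        "default_subfolder": "transformer",
    },
    "Lumina2AccessoryTransformer2DModel": {
        "checkpoint_mapping_fn": strip_prefix_stub,
        "default_subfolder": "transformer",
    },
}


def convert_checkpoint(class_name, state_dict):
    """Look up the mapping function registered for a class and apply it."""
    if class_name not in TOY_REGISTRY:
        raise ValueError(f"No single-file mapping registered for {class_name}")
    mapping_fn = TOY_REGISTRY[class_name]["checkpoint_mapping_fn"]
    return mapping_fn(state_dict)


converted = convert_checkpoint("Lumina2AccessoryTransformer2DModel", {"model.norm.weight": 1.0})
print(converted)
```

Sharing one mapping function only works if the two checkpoints use the same key layout, which is what the hunk implies by registering the existing converter for the new class.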
2 changes: 2 additions & 0 deletions src/diffusers/models/__init__.py
@@ -92,6 +92,7 @@
_import_structure["transformers.transformer_hunyuan_video_framepack"] = ["HunyuanVideoFramepackTransformer3DModel"]
_import_structure["transformers.transformer_ltx"] = ["LTXVideoTransformer3DModel"]
_import_structure["transformers.transformer_lumina2"] = ["Lumina2Transformer2DModel"]
_import_structure["transformers.transformer_lumina2_accessory"] = ["Lumina2AccessoryTransformer2DModel"]
_import_structure["transformers.transformer_mochi"] = ["MochiTransformer3DModel"]
_import_structure["transformers.transformer_omnigen"] = ["OmniGenTransformer2DModel"]
_import_structure["transformers.transformer_qwenimage"] = ["QwenImageTransformer2DModel"]
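
The `_import_structure` dictionary maps a submodule path to the public names it provides, so that a name can be resolved to its module only when it is first requested. The sketch below shows the general idea with standard-library modules standing in for the transformer submodules. `lazy_get` and `_name_to_module` are hypothetical names for this example, and the sketch makes no claim about how diffusers consumes the mapping internally.

```python
import importlib

# Submodule -> exported names, in the same shape as the hunk above.
# Standard-library modules are used as stand-ins so the sketch runs anywhere.
_import_structure = {
    "json": ["dumps", "loads"],
    "math": ["sqrt"],
}

# Invert the mapping once so each public name points at its defining module.
_name_to_module = {
    name: module_name
    for module_name, names in _import_structure.items()
    for name in names
}


def lazy_get(name):
    """Import the owning module on demand and return the requested attribute."""
    if name not in _name_to_module:
        raise AttributeError(f"{name!r} is not registered in _import_structure")
    module = importlib.import_module(_name_to_module[name])
    return getattr(module, name)


sqrt = lazy_get("sqrt")
print(sqrt(9.0))
```

Under this pattern, a new class becomes importable by name only after it is added to the mapping, which is why the hunk registers `Lumina2AccessoryTransformer2DModel` next to the existing Lumina2 entry.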
@@ -182,6 +183,7 @@
HunyuanVideoTransformer3DModel,
LatteTransformer3DModel,
LTXVideoTransformer3DModel,
Lumina2AccessoryTransformer2DModel,
Lumina2Transformer2DModel,
LuminaNextDiT2DModel,
MochiTransformer3DModel,
1 change: 1 addition & 0 deletions src/diffusers/models/transformers/__init__.py
@@ -29,6 +29,7 @@
from .transformer_hunyuan_video_framepack import HunyuanVideoFramepackTransformer3DModel
from .transformer_ltx import LTXVideoTransformer3DModel
from .transformer_lumina2 import Lumina2Transformer2DModel
from .transformer_lumina2_accessory import Lumina2AccessoryTransformer2DModel
from .transformer_mochi import MochiTransformer3DModel
from .transformer_omnigen import OmniGenTransformer2DModel
from .transformer_qwenimage import QwenImageTransformer2DModel