CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutSign UpSign In
huggingface

Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.

GitHub Repository: huggingface/notebooks
Path: blob/main/diffusers/clip_guided_images_mixing_with_stable_diffusion.ipynb
Views: 2932
Kernel: Unknown Kernel

CLIP Guided Images Mixing With Stable Diffusion

CLIP guided stable diffusion images mixing pipeline allows to combine two images using standard diffusion models. This approach is using (optional) CoCa model to avoid writing image description. This script was contributed by Karachev Denis and notebook by Parag Ekbote.

pip install torch matplotlib Pillow diffusers transformers open_clip_torch
Requirement already satisfied: torch in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (2.2.1+cu121) Requirement already satisfied: matplotlib in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (3.8.2) Requirement already satisfied: Pillow in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (11.1.0) Requirement already satisfied: diffusers in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (0.32.2) Requirement already satisfied: transformers in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (4.48.3) Collecting open_clip_torch Downloading open_clip_torch-2.30.0-py3-none-any.whl.metadata (31 kB) Requirement already satisfied: filelock in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (3.17.0) Requirement already satisfied: typing-extensions>=4.8.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (4.12.2) Requirement already satisfied: sympy in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (1.13.3) Requirement already satisfied: networkx in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (3.4.2) Requirement already satisfied: jinja2 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (3.1.5) Requirement already satisfied: fsspec in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (2025.2.0) Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.105) Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.105) Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.105) Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (8.9.2.26) Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.3.1) Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (11.0.2.54) Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (10.3.2.106) Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (11.4.5.107) Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.0.106) Requirement already satisfied: nvidia-nccl-cu12==2.19.3 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (2.19.3) Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (12.1.105) Requirement already satisfied: triton==2.2.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from torch) (2.2.0) Requirement already satisfied: nvidia-nvjitlink-cu12 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch) (12.8.61) Requirement already satisfied: contourpy>=1.0.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (1.3.1) Requirement already satisfied: cycler>=0.10 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (0.12.1) Requirement already satisfied: fonttools>=4.22.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (4.55.8) Requirement already satisfied: kiwisolver>=1.3.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (1.4.8) Requirement already satisfied: numpy<2,>=1.21 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (1.26.4) Requirement already satisfied: packaging>=20.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (24.2) Requirement already satisfied: pyparsing>=2.3.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (3.2.1) Requirement already satisfied: python-dateutil>=2.7 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from matplotlib) (2.9.0.post0) Requirement already satisfied: importlib-metadata in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from diffusers) (8.6.1) Requirement already satisfied: huggingface-hub>=0.23.2 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from diffusers) (0.28.1) Requirement already satisfied: regex!=2019.12.17 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from diffusers) (2024.11.6) Requirement already satisfied: requests in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from diffusers) (2.32.3) Requirement already satisfied: safetensors>=0.3.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from diffusers) (0.5.2) Requirement already satisfied: pyyaml>=5.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from transformers) (6.0.2) Requirement already satisfied: tokenizers<0.22,>=0.21 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from transformers) (0.21.0) Requirement already satisfied: tqdm>=4.27 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from transformers) (4.67.1) Requirement already satisfied: torchvision in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from open_clip_torch) (0.17.1+cu121) Collecting ftfy (from open_clip_torch) Downloading ftfy-6.3.1-py3-none-any.whl.metadata (7.3 kB) Collecting timm (from open_clip_torch) Downloading timm-1.0.14-py3-none-any.whl.metadata (50 kB) Requirement already satisfied: six>=1.5 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib) (1.17.0) Requirement already satisfied: wcwidth in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from ftfy->open_clip_torch) (0.2.13) Requirement already satisfied: zipp>=3.20 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from importlib-metadata->diffusers) (3.21.0) Requirement already satisfied: MarkupSafe>=2.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from jinja2->torch) (3.0.2) Requirement already satisfied: charset-normalizer<4,>=2 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from requests->diffusers) (3.4.1) Requirement already satisfied: idna<4,>=2.5 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from requests->diffusers) (3.10) Requirement already satisfied: urllib3<3,>=1.21.1 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from requests->diffusers) (2.3.0) Requirement already satisfied: certifi>=2017.4.17 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from requests->diffusers) (2025.1.31) Requirement already satisfied: mpmath<1.4,>=1.1.0 in /system/conda/miniconda3/envs/cloudspace/lib/python3.10/site-packages (from sympy->torch) (1.3.0) Downloading open_clip_torch-2.30.0-py3-none-any.whl (1.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 127.8 MB/s eta 0:00:00 Downloading ftfy-6.3.1-py3-none-any.whl (44 kB) Downloading timm-1.0.14-py3-none-any.whl (2.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.4/2.4 MB 141.5 MB/s eta 0:00:00 Installing collected packages: ftfy, timm, open_clip_torch Successfully installed ftfy-6.3.1 open_clip_torch-2.30.0 timm-1.0.14 Note: you may need to restart the kernel to use updated packages.
import PIL import torch import requests import open_clip from open_clip import SimpleTokenizer from io import BytesIO from diffusers import DiffusionPipeline from transformers import CLIPImageProcessor, CLIPModel def download_image(url): response = requests.get(url) return PIL.Image.open(BytesIO(response.content)).convert("RGB") # Loading additional models feature_extractor = CLIPImageProcessor.from_pretrained( "laion/CLIP-ViT-B-32-laion2B-s34B-b79K" ) clip_model = CLIPModel.from_pretrained( "laion/CLIP-ViT-B-32-laion2B-s34B-b79K", torch_dtype=torch.float16 ) coca_model = open_clip.create_model('coca_ViT-L-14', pretrained='laion2B-s13B-b90k').to('cuda') coca_model.dtype = torch.float16 coca_transform = open_clip.image_transform( coca_model.visual.image_size, is_train=False, mean=getattr(coca_model.visual, 'image_mean', None), std=getattr(coca_model.visual, 'image_std', None), ) coca_tokenizer = SimpleTokenizer() # Pipeline creating mixing_pipeline = DiffusionPipeline.from_pretrained( "CompVis/stable-diffusion-v1-4", custom_pipeline="clip_guided_images_mixing_stable_diffusion", clip_model=clip_model, feature_extractor=feature_extractor, coca_model=coca_model, coca_tokenizer=coca_tokenizer, coca_transform=coca_transform, torch_dtype=torch.float16, ) mixing_pipeline.enable_attention_slicing() mixing_pipeline = mixing_pipeline.to("cuda") # Pipeline running generator = torch.Generator(device="cuda").manual_seed(17) def download_image(url): response = requests.get(url) return PIL.Image.open(BytesIO(response.content)).convert("RGB") content_image = download_image("https://huggingface.co/datasets/TheDenk/images_mixing/resolve/main/boromir.jpg") style_image = download_image("https://huggingface.co/datasets/TheDenk/images_mixing/resolve/main/gigachad.jpg") pipe_images = mixing_pipeline( num_inference_steps=50, content_image=content_image, style_image=style_image, noise_strength=0.65, slerp_latent_style_strength=0.9, slerp_prompt_style_strength=0.1, slerp_clip_image_style_strength=0.1, guidance_scale=9.0, batch_size=1, clip_guidance_scale=100, generator=generator, ).images output_path = "mixed_output.jpg" pipe_images[0].save(output_path) print(f"Image saved successfully at {output_path}")
Image saved successfully at mixed_output.jpg