CVPR 2026 Workshop — Denver, Colorado
Abstraction
Heritage
Narrative
The 3rd AI for Visual Arts workshop at CVPR 2026 explores how computer vision perceives, interprets, and models abstraction across artistic and cultural imagery: paintings, comics, sculptures, and installations that challenge the assumptions of conventional models.
These stylized domains expose weaknesses in robustness, generalization, and interpretability, providing a principled framework for evaluating perception beyond realism. Can models "see" abstraction like humans, recognizing the dog-ness in a Picasso or inferring motion from comic lines?
Advance understanding of robust segmentation, saliency, and depth estimation under stylization and abstraction. Evaluate how vision models interpret abstraction and generalize beyond photorealistic imagery.
Benchmark and discuss authenticity, watermark resilience, and provenance in AI-assisted artistic processes. Examine how vision systems can detect, trace, and validate transformations in creative content.
Bridge creative and analytical domains—connecting artists, computer vision scientists, and curators to define trustworthy, human-aligned AI tools for creative and heritage contexts.
Friday, 11:59 PM AoE
Announced
Monday — for accepted full papers only
Friday — final papers and forms
Saturday, 11:59 PM AoE — new deadline
Tuesday, 11:59 PM AoE — no camera-ready required
Competition opens on CodaBench
Final submission deadline
8:30 AM – 1:00 PM · Room 4AB, Colorado Convention Center, Denver
| Time | Session |
|---|---|
| 08:30 – 08:35 | Welcome and Introduction |
| 08:35 – 09:00 | Keynote 1: Luba Elliott (20 min talk + 5 min Q&A) |
| 09:00 – 09:25 | Keynote 2: Hadar Averbuch-Elor (20 min talk + 5 min Q&A) |
| 09:25 – 09:55 | Sponsor Presentation & Challenge Winner Presentations (4 × 5 min talks + 5 min shared Q&A) |
| 09:55 – 10:05 | Award Ceremony |
| 10:05 – 11:00 | Coffee Break & Poster Session (Exhibit Hall A) |
| 11:00 – 11:25 | Keynote 3: Ranjay Krishna (20 min talk + 5 min Q&A) |
| 11:25 – 11:50 | Keynote 4: Pinar Yanardag (20 min talk + 5 min Q&A) |
| 11:50 – 12:10 | Archival Paper Presentations (2 papers, 7 min presentation + 3 min Q&A each) |
| 12:10 – 12:30 | Panel Discussion and Closing Remarks |
Novel, previously unpublished research. Accepted papers appear in the official CVPR 2026 Workshop Proceedings.
New, previously published, or concurrently published research, or work in progress. Will not appear in proceedings.
Congratulations to all authors whose work was accepted to the 3rd AI for Visual Arts Workshop. Full papers (archival) appear in the official CVPR 2026 Workshop Proceedings; extended abstracts are non-archival. A total of 10 submissions have been accepted: 4 full papers and 6 extended abstracts.
The AI4VA Image Composition Challenge (PortraitCraft) features two competitive tracks that push the boundaries of AI-driven portrait composition understanding and generation. Participants are invited to develop novel methods for analysing and composing portrait imagery in artistic contexts.

Image Composition
Portrait Understanding & Generation
PortraitCraft is a benchmark dataset for portrait composition understanding and portrait composition generation. It is designed to support models in learning and evaluating composition quality in portrait images. The dataset focuses on images with humans as the primary subjects and covers a wide range of scenarios, including single-person and multi-person portraits, as well as half-body and full-body compositions. It emphasizes key composition factors such as subject prominence, pose quality, image layout, and overall visual atmosphere.
PortraitCraft supports two task directions: portrait composition understanding and portrait composition generation. The dataset is constructed through large-model-assisted filtering combined with evaluation by professional designers, ensuring high-quality samples and fine-grained composition annotations.
Key dates for the PortraitCraft challenge. See the workshop timeline for full paper and workshop dates.
The challenge is organized into two complementary tracks. Participants may choose to focus on structured analysis of existing portraits, generation from composition-oriented specifications, or both.
Given a portrait image, predict the overall composition quality score, provide fine-grained attribute judgments, and answer a challenging visual question.
CodaBench: Track 1 competition →
Given structured composition descriptions, generate portrait images that accurately realize the specified layout and aesthetic intent.
CodaBench: Track 2 competition →
Participate: Track 1 on CodaBench
Portrait Composition Understanding aims to evaluate a model's ability to understand portrait composition in a structured and interpretable way. Given a portrait image, participants are required to produce three types of outputs: a predicted overall composition score, ternary judgments on dozens of predefined fine-grained composition attributes, and an answer to a carefully designed visual question that tests detailed understanding of the image content. Unlike traditional aesthetic assessment tasks that focus only on global quality prediction, this track emphasizes both attribute-level composition analysis and fine-grained visual comprehension. The goal is to encourage models that not only estimate how good a portrait is, but also explain why it is good and demonstrate that they truly understand the image.
In this track, the model takes a portrait image as input and is required to perform structured composition analysis. The task consists of a training stage and a testing stage.
(1) Training stage. The training data for Track 1 are provided in the form of image-text pairs. Each training sample consists of a portrait image and a corresponding text description. The text includes an overall composition score, the scores of 13 composition attributes, and explanations of these attribute-level judgments.
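(2) Test stage. During testing, the model receives a single portrait image as input and is required to produce three types of outputs: a predicted overall composition score, ternary judgments on the predefined fine-grained composition attributes, and an answer to a visual question about the image.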
Together, these three outputs assess global composition evaluation, attribute-level composition reasoning, and detailed visual understanding within a unified framework.
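One way to picture the three Track 1 outputs is as a single record per test image. The sketch below is illustrative only: the class name, field names, attribute names, and the three label strings used for the ternary judgments are all assumptions, not the official submission format.

```python
from dataclasses import dataclass, field

# Hypothetical label set for a ternary judgment; the real labels are not stated here.
TERNARY_LABELS = ("negative", "neutral", "positive")


@dataclass
class Track1Prediction:
    """Illustrative container for the three Track 1 outputs (all names are assumptions)."""

    image_id: str
    overall_score: float  # predicted overall composition score
    attribute_judgments: dict[str, str] = field(default_factory=dict)  # attribute -> ternary label
    vqa_answer: str = ""  # answer to the visual question

    def validate(self) -> None:
        """Raise ValueError on an unknown ternary label or an empty answer."""
        for name, label in self.attribute_judgments.items():
            if label not in TERNARY_LABELS:
                raise ValueError(f"{name}: unexpected label {label!r}")
        if not self.vqa_answer.strip():
            raise ValueError("vqa_answer must not be empty")


# Example record; the attribute names are made up for illustration.
example = Track1Prediction(
    image_id="0001",
    overall_score=7.5,
    attribute_judgments={"subject_prominence": "positive", "pose_quality": "neutral"},
    vqa_answer="B",
)
example.validate()
```

Keeping the three outputs in one record makes it easy to check locally that every test image has a score, a complete set of judgments, and an answer before preparing a submission.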
Submissions will be evaluated using a unified score that jointly considers performance across all three required outputs. The exact weighting and implementation details of the evaluation protocol are not disclosed to ensure fairness and robustness of the benchmark.
The pair below illustrates an input portrait alongside a visualization of annotation results, highlighting the kinds of fine-grained composition cues the challenge considers.


Participate: Track 2 on CodaBench
Portrait Composition Generation evaluates a model's ability to generate portrait images from structured composition-oriented descriptions. Participants are provided with training data consisting of portrait images paired with composition-focused annotations. At test time, only the structured descriptions are given, and models must generate corresponding portraits. This track emphasizes whether models can accurately interpret and realize composition requirements in generated portraits.
In this track, the model takes structured composition-oriented descriptions as input and is required to generate corresponding portrait images. The task consists of two stages:
(1) Training stage. Participants are provided with portrait images paired with structured textual descriptions that focus on composition and aesthetic attributes such as subject placement, spatial organization, visual center, negative space, and overall composition style.
(2) Test stage. During testing, only structured composition descriptions are provided. Participants are required to generate portrait images that reflect the specified composition requirements.
The final score is computed based on the consistency between the generated images and the target structured composition descriptions. The evaluation focuses on whether the generated results accurately reflect the key composition and aesthetic characteristics specified in the descriptions.
Figure 3 shows a reference portrait; Figure 4 shows an image regenerated from a structured composition description.


Full-body portrait photograph of a young woman dancing gracefully on a long wooden pier extending into shallow turquoise sea water at sunset.
Strong central perspective composition. The pier forms leading lines toward the horizon. The subject is positioned near the center axis of the frame. She stands on one foot with the other leg lifted and bent, arms extended outward in an expressive balanced pose.
A soft, faint rainbow appears diagonally across the sky from the upper-right corner to the lower-left region of the frame, forming a subtle diagonal compositional structure that enhances visual flow without dominating the scene.
…
Each challenge track will recognise its top two leaderboard submissions with cash prizes. Winners and runners-up across both tracks will also receive a dedicated presentation slot during the workshop.
Awarded to the highest-quality full paper submission demonstrating novelty, rigour, and impact at the intersection of AI and visual arts.
Awarded for an outstanding poster presentation that effectively communicates innovative research to the workshop audience.

Hadar Averbuch-Elor
Cornell University & Cornell Tech
Assistant Professor in Computer Science. Her research combines images, language and 3D geometry for building multimodal perception systems that can handle the full complexity of the real 3D world.

Ranjay Krishna
University of Washington & Allen Institute for AI
Assistant Professor at the Allen School. Co-directs the RAIVN lab at UW and directs the PRIOR team at AI2. His research lies at the intersection of computer vision, NLP, robotics, and HCI.

Luba Elliott
Curator & Researcher, Creative AI
Curator, producer and researcher specialising in AI in the creative industries. Member of the CVPR Organising Committee, leading the CVPR Art Gallery each year.

Pinar Yanardag
Virginia Tech
Assistant Professor of Computer Science at Virginia Tech, leading GEMLAB on controllable, personalized generative AI. Previously a Fulbright PhD Fellow at Purdue and postdoc at MIT Media Lab. NSF CAREER awardee; Emmy-nominated Creative Director for HBO's Westworld.
All posters: 42″ × 21″ (W × H, aspect ratio 2:1, landscape format). All accepted papers present a poster. Selected oral spotlight presenters will deliver 7 min presentation + 3 min Q&A, plus poster presentation.
Logos and poster templates for Main + Findings & Workshops can be found at Google Drive →
Feel free to use your own artwork, but we recommend a 3- or 4-column layout with little text and a few large, expressive figures. The poster should not be a copy-paste of your paper; it should give you the "tools" to deliver a 5–10 minute presentation of your work to any attendee. We recommend looking at posters from previous years for inspiration. Templates and logos are posted above.
CVPR 2026 offers a poster printing service for attendees who would like to collect their printed poster onsite at the Colorado Convention Center in Denver. The link is active now:
https://cvprus.myprintdesk.net/login
You will be prompted during the order process to enter the presenter's name and contact information. Accurate information is essential for proper distribution of the posters. Please list the workshop and your paper ID number along with your full name.
All posters are printed on 8mil Satin Poster Paper, Full Colour and Single Sided. Posters are delivered rolled and delivery to the convention center is included. A PDF file is preferred. 100 DPI or vector art at full size is recommended but the file will automatically be scaled if needed. Please do not include bleed.
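For raster exports, the recommended resolution translates directly into pixel dimensions. A minimal sketch of that arithmetic, assuming the 42″ × 21″ poster size and the 100 DPI recommendation stated on this page (the function name is only illustrative):

```python
def poster_pixels(width_in: float, height_in: float, dpi: int = 100) -> tuple[int, int]:
    """Return (width, height) in pixels for a poster of the given size in inches."""
    return round(width_in * dpi), round(height_in * dpi)


width_px, height_px = poster_pixels(42, 21)
print(width_px, height_px)  # 4200 2100
```

A 4200 × 2100 pixel canvas keeps the required 2:1 landscape aspect ratio at full size; vector PDFs do not need this calculation.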
Online orders will close Friday, May 29 at 12:00 PM EST.
Local orders for Workshop Posters from ARC Denver are not available.
| Dates | Hours |
|---|---|
| Wednesday, June 3 & Thursday, June 4 | 7:30 AM – 3:00 PM |
| Friday, June 5 – Sunday, June 7 | 7:30 AM – 5:00 PM |
Once you pick up your poster, it is your responsibility. CVPR will not hold your poster for you. Please remember to remove your poster at the end of your session or it will be discarded.
Marissa — marissa@ctocevents.com
For questions before submission only. Printer questions will not be answered at this address.
RiotColor — riotmax@riotcolor.com
Phone: 410-992-9898
Please have your order number ready.
Lead Organizer
Assistant Professor, University of Bath, UK (Publicity Co-Chair, CVPR)
We gratefully acknowledge the generous support of our sponsors who make this workshop and its awards possible.

Interested in supporting the intersection of AI and visual arts? We welcome partners who share our vision for trustworthy, human-aligned AI in creative domains.
Get in Touch