CVPR 2026 Workshop — Denver, Colorado

AI for Visual Arts: The 3rd Edition

June 4, 2026 (Tentative)
Colorado Convention Center
Half-Day Workshop
01
Picasso - Girl with a Mandolin (1910): Abstraction
Leonardo da Vinci Museum Exhibition: Heritage
Roy Lichtenstein - Whaam!: Narrative

Where machines learn to perceive abstraction

The 3rd AI for Visual Arts workshop at CVPR 2026 explores how computer vision perceives, interprets, and models abstraction across artistic and cultural imagery—paintings, comics, sculptures, and installations that challenge assumptions of conventional models.

These stylized domains expose weaknesses in robustness, generalization, and interpretability, providing a principled framework for evaluating perception beyond realism. Can models "see" abstraction like humans, recognizing the dog-ness in a Picasso or inferring motion from comic lines?

01

Perception Under Abstraction

Advance understanding of robust segmentation, saliency, and depth estimation under stylization and abstraction. Evaluate how vision models interpret abstraction and generalize beyond photorealistic imagery.

02

Provenance & Authenticity

Benchmark and discuss authenticity, watermark resilience, and provenance in AI-assisted artistic processes. Examine how vision systems can detect, trace, and validate transformations in creative content.

03

Cross-Disciplinary Exchange

Bridge creative and analytical domains—connecting artists, computer vision scientists, and curators to define trustworthy, human-aligned AI tools for creative and heritage contexts.

02 Research Areas

Topics of Interest

001 Robust segmentation & depth estimation in stylized imagery
002 Saliency detection in narrative visual forms
003 Vision–language alignment in art interpretation
004 Generative models & 3D modeling from artworks
005 Provenance tracking & authenticity verification
006 Watermark robustness in AI-generated art
007 Computer vision for museum analysis & digitization
008 Art restoration & preservation using computer vision
009 AI ethics, authorship & responsible AI in creative contexts
010 Multimodal reasoning in art & cultural heritage
011 Art & animation
012 Image composition for visual arts
013 Any AI × Visual Art intersection

03 Timeline

Important Dates

June 4, 2026 (tentative)
Colorado Convention Center
Denver, Colorado
United States

Full Papers (Archival) - Deadline Passed

Mar 13, 2026

Full Paper Submission Deadline

Friday, 11:59 PM AoE

Mar 30, 2026

Acceptance Notification

Announced

Mar 30, 2026

Camera-Ready Site Opens

Monday — for accepted full papers only

Apr 10, 2026

Camera-Ready & Copyright Due

Friday — final papers and forms

Extended Abstracts (Non-Archival) - Accepting Submissions

Apr 13, 2026

Extended Abstract Submission

Monday, 11:59 PM AoE

Apr 30, 2026

Acceptance Notification

Thursday, 11:59 PM AoE — no camera-ready required

PortraitCraft Challenge - Accepting Submissions

Apr 5, 2026

Challenge Start

Competition opens on CodaBench

May 17, 2026

Challenge End

Final submission deadline

Jun 4 PM session (Tentative)

Workshop Day

AI4VA @ CVPR 2026, Denver

All deadlines are AoE (Anywhere on Earth). Full papers appear in CVPR Workshop Proceedings. Extended abstracts are non-archival.
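
Since every deadline is stated in AoE, it can help to convert one to UTC before comparing it with a local clock. The following is a minimal sketch, assuming the usual convention that AoE corresponds to the fixed UTC-12:00 offset; the helper name `aoe_deadline_to_utc` is hypothetical and not part of any workshop tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AoE (Anywhere on Earth) is the fixed offset UTC-12:00.
AOE = timezone(timedelta(hours=-12), "AoE")


def aoe_deadline_to_utc(year, month, day, hour=23, minute=59):
    """Return the UTC instant for a wall-clock time expressed in AoE."""
    local = datetime(year, month, day, hour, minute, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Extended abstract deadline from the timeline: Apr 13, 2026, 11:59 PM AoE.
deadline_utc = aoe_deadline_to_utc(2026, 4, 13)
print(deadline_utc.isoformat())  # 2026-04-14T11:59:00+00:00
```

Calling `deadline_utc.astimezone()` with no argument converts the result to the local time zone of the machine running the script.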
04 Call for Papers

Submit Your Work

Track I — Archival

Full Papers - Deadline Passed

Novel, previously unpublished research. Accepted papers appear in the official CVPR 2026 Workshop Proceedings.

Length: 5–8 pages
Format: CVPR 2026 Style
Review: Double-blind
Supplementary: Optional
Track II — Non-Archival

Extended Abstracts - Accepting Submissions

New research, previously or concurrently published research, or work in progress. Extended abstracts will not appear in the proceedings.

Length: 2 pages
Format: CVPR 2026 Style
Review: Double-blind
Camera-Ready: Not required
Submission Portal

Submit via OpenReview

There are two tracks. Please submit to the appropriate track (Extended Abstract or Full Paper) on the submission site.

⚠ Select the correct track when submitting

Presentation Guidelines

All accepted papers are presented as posters. Posters must be in portrait orientation and at most 90 × 180 cm. Authors selected for an oral spotlight give a 7-minute talk with Q&A in addition to their poster presentation.

05 Competition

PortraitCraft Challenge

The AI4VA Image Composition Challenge (PortraitCraft) features two competitive tracks that push the boundaries of AI-driven portrait composition understanding and generation. Participants are invited to develop novel methods for analysing and composing portrait imagery in artistic contexts.

Image Composition: Portrait Understanding & Generation

Benchmark

Dataset Introduction

PortraitCraft is a benchmark dataset for portrait composition understanding and portrait composition generation. It is designed to support models in learning and evaluating composition quality in portrait images. The dataset focuses on images with humans as the primary subjects and covers a wide range of scenarios, including single-person and multi-person portraits, as well as half-body and full-body compositions. It emphasizes key composition factors such as subject prominence, pose quality, image layout, and overall visual atmosphere.

PortraitCraft supports two task directions: portrait composition understanding and portrait composition generation. The dataset is constructed through large-model-assisted filtering combined with evaluation by professional designers, ensuring high-quality samples and fine-grained composition annotations.

Dataset & Baseline

Timeline

Challenge Period

Key dates for the PortraitCraft challenge. See the workshop timeline for full paper and workshop dates.

Start: April 5, 2026
End: May 17, 2026
Competition

Two Challenge Tracks

The challenge is organized into two complementary tracks. Participants may choose to focus on structured analysis of existing portraits, generation from composition-oriented specifications, or both.

Challenge Track 1

Portrait Composition Understanding

Given a portrait image, predict the overall composition quality score, provide fine-grained attribute judgments, and answer a challenging visual question.

CodaBench: Track 1 competition →
Challenge Track 2

Portrait Composition Generation

Given structured composition descriptions, generate portrait images that accurately realize the specified layout and aesthetic intent.

CodaBench: Track 2 competition →
Track 1

Portrait Composition Understanding

Participate: Track 1 on CodaBench

Introduction

Portrait Composition Understanding aims to evaluate a model's ability to understand portrait composition in a structured and interpretable way. Given a portrait image, participants are required to produce three types of outputs: a predicted overall composition score, ternary judgments on dozens of predefined fine-grained composition attributes, and an answer to a carefully designed visual question that tests detailed understanding of the image content. Unlike traditional aesthetic assessment tasks that focus only on global quality prediction, this track emphasizes both attribute-level composition analysis and fine-grained visual comprehension. The goal is to encourage models that not only estimate how good a portrait is, but also explain why it is good and demonstrate that they truly understand the image.

Model Input and Output

In this track, the model takes a portrait image as input and is required to perform structured composition analysis. The task consists of a training stage and a testing stage.

(1) Training stage. The training data for Track 1 are provided in the form of image-text pairs. Each training sample consists of a portrait image and a corresponding text description. The text includes an overall composition score, the scores of 13 composition attributes, and explanations of these attribute-level judgments.

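(2) Test stage. During testing, the model receives a single portrait image as input and is required to produce three types of outputs: a predicted overall composition score, ternary judgments on the predefined fine-grained composition attributes, and an answer to a visual question about the image.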

Together, these three outputs assess global composition evaluation, attribute-level composition reasoning, and detailed visual understanding within a unified framework.
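
The three outputs described above can be held together in one small record per image. The following is a minimal sketch, not the official submission format: the class name `Track1Prediction`, the `-1/0/1` encoding of a ternary judgment, the attribute names, and the score value are all assumptions made for illustration. The required file layout is defined by the challenge documentation on CodaBench.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumed encoding of a ternary judgment; the official labels may differ.
TERNARY_LABELS = (-1, 0, 1)


@dataclass
class Track1Prediction:
    """One test-stage result: score, attribute judgments, and VQA answer."""

    overall_score: float
    attribute_judgments: Dict[str, int] = field(default_factory=dict)
    vqa_answer: str = ""

    def validate(self) -> None:
        """Raise ValueError if a judgment is not ternary or the answer is empty."""
        for name, label in self.attribute_judgments.items():
            if label not in TERNARY_LABELS:
                raise ValueError(
                    f"{name}: expected one of {TERNARY_LABELS}, got {label}"
                )
        if not self.vqa_answer.strip():
            raise ValueError("vqa_answer must not be empty")


# Illustrative values only; the attribute names are placeholders.
prediction = Track1Prediction(
    overall_score=7.5,
    attribute_judgments={"subject_prominence": 1, "pose_quality": 0},
    vqa_answer="The subject stands left of center.",
)
prediction.validate()  # passes silently for well-formed predictions
```

Validating each record before writing a submission file catches malformed attribute labels early, without relying on the undisclosed scoring protocol.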

Evaluation Metrics

Submissions will be evaluated using a unified score that jointly considers performance across all three required outputs. The exact weighting and implementation details of the evaluation protocol are not disclosed to ensure fairness and robustness of the benchmark.

Example

The pair below illustrates an input portrait alongside a visualization of annotation results, highlighting the kinds of fine-grained composition cues the challenge considers.

This figure is for illustration purposes only. Please refer to the official challenge documentation for detailed evaluation specifications.
Figure 1 — Input Image
Figure 2 — Annotation Results
Track 2

Portrait Composition Generation

Participate: Track 2 on CodaBench

Introduction

Portrait Composition Generation evaluates a model's ability to generate portrait images from structured composition-oriented descriptions. Participants are provided with training data consisting of portrait images paired with composition-focused annotations. At test time, only the structured descriptions are given, and models must generate corresponding portraits. This track emphasizes whether models can accurately interpret and realize composition requirements in generated portraits.

Model Input and Output

In this track, the model takes structured composition-oriented descriptions as input and is required to generate corresponding portrait images. The task consists of two stages:

(1) Training stage. Participants are provided with portrait images paired with structured textual descriptions that focus on composition and aesthetic attributes such as subject placement, spatial organization, visual center, negative space, and overall composition style.

(2) Test stage. During testing, only structured composition descriptions are provided. Participants are required to generate portrait images that reflect the specified composition requirements.

Evaluation Metrics

The final score is computed based on the consistency between the generated images and the target structured composition descriptions. The evaluation focuses on whether the generated results accurately reflect the key composition and aesthetic characteristics specified in the descriptions.

Example

Figure 3 shows a reference portrait; Figure 4 shows an image regenerated from a structured composition description.

This figure is for illustration purposes only. Please refer to the official challenge documentation for detailed evaluation specifications.
Figure 3 — Original Image
Figure 4 — Regenerated Image

Structured Description (Excerpt)

Full-body portrait photograph of a young woman dancing gracefully on a long wooden pier extending into shallow turquoise sea water at sunset.

Strong central perspective composition. The pier forms leading lines toward the horizon. The subject is positioned near the center axis of the frame. She stands on one foot with the other leg lifted and bent, arms extended outward in an expressive balanced pose.

A soft, faint rainbow appears diagonally across the sky from the upper-right corner to the lower-left region of the frame, forming a subtle diagonal compositional structure that enhances visual flow without dominating the scene.

professional photography, cinematic lighting, diagonal composition, leading lines, central perspective symmetry, elegant movement, balanced composition, subtle rainbow accent, high aesthetic quality, sharp details, realistic style.

Co-organised by Meitu

06 Recognition

Workshop Awards

🏆

Best Paper Award

$1,000

Awarded to the highest-quality full paper submission demonstrating novelty, rigour, and impact at the intersection of AI and visual arts.

Sponsored by Meitu
🎖️

Best Poster Award

$500

Awarded for an outstanding poster presentation that effectively communicates innovative research to the workshop audience.

Sponsored by Meitu
All winners will be recognized with certificates at the workshop award ceremony.

07 Keynotes

Invited Speakers

Confirmed
Hadar Averbuch-Elor

Cornell University & Cornell Tech

Assistant Professor in Computer Science. Her research combines images, language and 3D geometry for building multimodal perception systems that can handle the full complexity of the real 3D world.

Confirmed
Ranjay Krishna

University of Washington & Allen Institute for AI

Assistant Professor at the Allen School. Co-directs the RAIVN lab at UW and directs the PRIOR team at AI2. His research lies at the intersection of computer vision, NLP, robotics, and HCI.

Confirmed
Luba Elliott

Curator & Researcher, Creative AI

Curator, producer and researcher specialising in AI in the creative industries. Honorary Senior Research Fellow at UCL Centre for Artificial Intelligence.

TBC

The fourth keynote speaker will be announced soon. Please check back for updates.

08 Organizers

Organizing Committee

Deblina Bhattacharjee, Lead Organizer

Assistant Professor, University of Bath, UK (Publicity Co-Chair, CVPR)

Iris (Yin) Zhang, Organizer

PhD in Translation Studies, University of Geneva

Bingchen Zhao, Organizer

PhD in Computer Science, University of Edinburgh

Haoxiang Li, Challenge Co-Organizer

Meitu Inc. (Pixocial Technology)

Luoqi Liu, Challenge Co-Organizer

VP of Technology, Meitu Inc.

Support

Sponsors


Get in Touch
Contact

Questions?

db2466[at]bath[dot]ac[dot]uk