06. Show, Don't Tell: Screenshots and Image Input

06. Show, Don't Tell: Screenshots and Image Input — Yaiment

A Picture Is Worth a Thousand Prompts

Claude Code accepts images. Most users do not know this. They describe a broken UI (the User Interface, the parts of the screen you see and click) in three paragraphs of text, and Claude takes a guess. Or they paste an error message that lost half its context when they copied it.

You can just show Claude the screen. The model reads pixels. It sees the alignment, the colors, the icons, the error toast, the cursor position. It reads text inside the image. It can even compare two screenshots and tell you what changed.

This page covers the three ways to get an image into a session and the workflows where this matters most.

Three Ways to Add an Image

There are three methods. Two work everywhere. The clipboard method has platform quirks.

1. Drag and drop the image file into the Claude Code window
2. Paste from the clipboard
3. Reference the file path in your prompt

Drag and drop works on macOS, Windows, and Linux. Take a screenshot, find the file in Finder or File Explorer, drag it into your terminal where Claude Code is running. The path gets inserted into your prompt and the image is attached. On Windows, hold Shift while dragging if your screenshot wants to open in another app instead of attaching.

Clipboard paste is the fastest workflow when it works, but the keystroke differs:

macOS    → Ctrl+V    (NOT Cmd+V. The terminal eats that.)
Linux    → Ctrl+V
Windows  → drag and drop is currently more reliable

The Mac trap is real. Cmd+V is the terminal's paste shortcut, which sends the image to the terminal emulator instead of Claude Code. Use Ctrl+V to get the paste through.

File path works in every situation:

> Look at this screenshot and tell me why the popup window is misaligned: /Users/me/Desktop/bug.png

Claude opens the file and reads it. Useful in scripts, useful when you have already saved screenshots into the repo, useful as a fallback when the other methods misbehave.

Info

Tip: On macOS, Cmd+Ctrl+Shift+4 captures a screen region directly to the clipboard. No file is saved. Then Ctrl+V in Claude Code attaches it. Two keystrokes, zero files.

Info

Tip: Put the image before the text in your prompt. Claude reads images earlier in the message more reliably. So write [image] What is wrong with this layout? not What is wrong with this layout? [image].

When Images Are the Right Tool

Images shine for problems that are hard to describe in words.

UI bugs. Try describing this in words: "The submit button is two pixels too far left, and the outline around it when you click is yellow instead of blue." That sentence is wrong half the time. The screenshot is right every time.

Error messages from other apps. Logs you cannot copy easily. Small pop-up notifications that disappear before you can read them (developers call these "toasts"). Output from Docker Desktop or any other tool that shows things on screen but does not give you plain text to copy. Snap it, paste it, ask.

Design references. You want a layout that looks like the marketing page of some other product. Screenshot their page. Ask Claude to build something with the same feel.

Diagrams. Architecture sketches drawn on paper. Whiteboard photos. Flowcharts and diagrams from your team's wiki. Claude reads them and can write the matching code.

Before and after. Take a screenshot before you change anything. After your change, take another. Paste both. Ask Claude to confirm the visual difference matches what you intended.

Info

Warning: Claude is billed by token (the small pieces of text and image data the model reads), and images turn into a lot of tokens. A high-resolution screenshot is expensive. If you only need to show a button, crop the screenshot first. Send the smallest image that contains the answer. Above a certain size, the model automatically shrinks your image to fit, which can make small text illegible. That limit depends on the model: around 1568 pixels on the long edge for most, up to 2576 pixels on the newest Opus models. Crop before sending rather than relying on auto-resize.

Info

Warning: In a long conversation, every message re-sends the image data from earlier turns. One big paste at the top of a session gets more expensive with every follow-up. If you expect a long back-and-forth, send the image, get the fix, then start a fresh session. Or crop tightly so the image is small to begin with.

A Visual Debugging Workflow

Here is a workflow that pays for itself the first time you use it.

Run the broken UI in a browser or app.
Open Claude Code in the same project.
Capture the broken state. Cmd+Ctrl+Shift+4 on macOS, Win+Shift+S on Windows.
Paste or drag into Claude Code.
Prompt: "This is what the dashboard looks like right now. The total card on the right is supposed to be aligned with the cards on the left. Find and fix the CSS."
Claude reads the screenshot, finds the file, applies the fix.
Reload the browser. Take another screenshot. Paste it. "Did this fix the alignment?"

This loop is faster than describing the bug in words and faster than fixing it yourself when you do not know which part of the code is producing the wrong style.

What Images Do Not Solve

A screenshot is a snapshot, not a recording. If the bug only appears during an animation or after three clicks, a single image will not capture it. Record a short screen capture, take three stills from key moments, or describe the sequence in text.

Animated GIFs are also snapshots as far as Claude is concerned. Only the first frame gets read. If you want Claude to see motion, pull out two or three still frames manually and paste them in order.

The model also cannot interact with the image. It cannot zoom, scroll, or click. If the relevant detail is below the part of the page currently on screen (the bit you have to scroll down to see), scroll first and then capture, or take multiple screenshots covering the visible regions.

Claude's spatial reasoning has limits. It reads layout well, but asking "exactly how many pixels is this from the left edge?" or "is this element at coordinate 340, 218?" will give you rough approximations, not measurements you can paste directly into code. Use screenshots to identify the problem; use your browser's dev tools to get the exact numbers.

Summary

Claude Code reads images. Drag, paste, or reference by path. On Mac use Ctrl+V, not Cmd+V. Put the image before the text in your prompt. Use images for visual bugs, error messages, design references, and before/after comparisons. Crop before sending. High-res images are expensive, and the model auto-shrinks oversized ones (the limit runs from about 1568 to 2576 pixels on the long edge depending on the model) which can make fine details unreadable. In long sessions, paste images close to the question to avoid re-sending large data in every follow-up. GIFs show only the first frame. Claude cannot interact with an image or give you precise pixel coordinates. The screenshot workflow turns "I think the button is wrong" into "here is the button, fix it." For everything else, plain text is still faster.

Next up: 07. Context Is Everything: CLAUDE.md and Projects.

Discussion

A Picture Is Worth a Thousand Prompts

Three Ways to Add an Image

When Images Are the Right Tool

A Visual Debugging Workflow

What Images Do Not Solve

Summary