Claude Code accepts images. Most users do not know this. They describe a broken UI (the User Interface, the parts of the screen you see and click) in three paragraphs of text, and Claude takes a guess. Or they paste an error message that lost half its context when they copied it.
You can just show Claude the screen. The model reads pixels. It sees the alignment, the colors, the icons, the error toast, the cursor position. It reads text inside the image. It can even compare two screenshots and tell you what changed.
This page covers the three ways to get an image into a session and the workflows where this matters most.
There are three methods. Two work everywhere. The clipboard method has platform quirks.
1. Drag and drop the image file into the Claude Code window
2. Paste from the clipboard
3. Reference the file path in your promptDrag and drop works on macOS, Windows, and Linux. Take a screenshot, find the file in Finder or File Explorer, drag it into your terminal where Claude Code is running. The path gets inserted into your prompt and the image is attached. On Windows, hold Shift while dragging if your screenshot wants to open in another app instead of attaching.
Clipboard paste is the fastest workflow when it works, but the keystroke differs:
macOS → Ctrl+V (NOT Cmd+V. The terminal eats that.)
Linux → Ctrl+V
Windows → drag and drop is currently more reliableThe Mac trap is real. Cmd+V is the terminal's paste shortcut, which sends the image to the terminal emulator instead of Claude Code. Use Ctrl+V to get the paste through.
File path works in every situation:
> Look at this screenshot and tell me why the popup window is misaligned: /Users/me/Desktop/bug.pngClaude opens the file and reads it. Useful in scripts, useful when you have already saved screenshots into the repo, useful as a fallback when the other methods misbehave.
Info
Tip: On macOS, Cmd+Ctrl+Shift+4 captures a screen region directly to the clipboard. No file is saved. Then Ctrl+V in Claude Code attaches it. Two keystrokes, zero files.
Info
Tip: Put the image before the text in your prompt. Claude reads images earlier in the message more reliably. So write [image] What is wrong with this layout? not What is wrong with this layout? [image].
Images shine for problems that are hard to describe in words.
UI bugs. Try describing this in words: "The submit button is two pixels too far left, and the outline around it when you click is yellow instead of blue." That sentence is wrong half the time. The screenshot is right every time.
Error messages from other apps. Logs you cannot copy easily. Small pop-up notifications that disappear before you can read them (developers call these "toasts"). Output from Docker Desktop or any other tool that shows things on screen but does not give you plain text to copy. Snap it, paste it, ask.
Design references. You want a layout that looks like the marketing page of some other product. Screenshot their page. Ask Claude to build something with the same feel.
Diagrams. Architecture sketches drawn on paper. Whiteboard photos. Flowcharts and diagrams from your team's wiki. Claude reads them and can write the matching code.
Before and after. Take a screenshot before you change anything. After your change, take another. Paste both. Ask Claude to confirm the visual difference matches what you intended.
Info
Warning: Claude is billed by token (the small pieces of text and image data the model reads), and images turn into a lot of tokens. A high-resolution screenshot is expensive. If you only need to show a button, crop the screenshot first. Send the smallest image that contains the answer. Above a certain size, the model automatically shrinks your image to fit, which can make small text illegible. That limit depends on the model: around 1568 pixels on the long edge for most, up to 2576 pixels on the newest Opus models. Crop before sending rather than relying on auto-resize.
Info
Warning: In a long conversation, every message re-sends the image data from earlier turns. One big paste at the top of a session gets more expensive with every follow-up. If you expect a long back-and-forth, send the image, get the fix, then start a fresh session. Or crop tightly so the image is small to begin with.
Here is a workflow that pays for itself the first time you use it.
Cmd+Ctrl+Shift+4 on macOS, Win+Shift+S on Windows.This loop is faster than describing the bug in words and faster than fixing it yourself when you do not know which part of the code is producing the wrong style.
A screenshot is a snapshot, not a recording. If the bug only appears during an animation or after three clicks, a single image will not capture it. Record a short screen capture, take three stills from key moments, or describe the sequence in text.
Animated GIFs are also snapshots as far as Claude is concerned. Only the first frame gets read. If you want Claude to see motion, pull out two or three still frames manually and paste them in order.
The model also cannot interact with the image. It cannot zoom, scroll, or click. If the relevant detail is below the part of the page currently on screen (the bit you have to scroll down to see), scroll first and then capture, or take multiple screenshots covering the visible regions.
Claude's spatial reasoning has limits. It reads layout well, but asking "exactly how many pixels is this from the left edge?" or "is this element at coordinate 340, 218?" will give you rough approximations, not measurements you can paste directly into code. Use screenshots to identify the problem; use your browser's dev tools to get the exact numbers.
Claude Code reads images. Drag, paste, or reference by path. On Mac use Ctrl+V, not Cmd+V. Put the image before the text in your prompt. Use images for visual bugs, error messages, design references, and before/after comparisons. Crop before sending. High-res images are expensive, and the model auto-shrinks oversized ones (the limit runs from about 1568 to 2576 pixels on the long edge depending on the model) which can make fine details unreadable. In long sessions, paste images close to the question to avoid re-sending large data in every follow-up. GIFs show only the first frame. Claude cannot interact with an image or give you precise pixel coordinates. The screenshot workflow turns "I think the button is wrong" into "here is the button, fix it." For everything else, plain text is still faster.
Next up: 07. Context Is Everything: CLAUDE.md and Projects.
Discussion
0 comments· Help other readers by sharing what worked (or didn't) for you.
Sign in to join the discussion.
No comments yet. Be the first to share your thoughts!