Skill · charliehills

youtube-thumbnail.

Thumbnail before script. Sells the video before anyone hears a word.

$ git clone https://github.com/charlie947/social-media-skills.git ~/.claude/skills/social-media-skills

The thumbnail-first workflow is the same instinct as drafting a hook before drafting a post. The artifact that has to do the most work for the video is the thumbnail, and writing the script before testing whether the thumbnail can carry the title is the most expensive ordering mistake a YouTube creator can make. youtube-thumbnail is the skill that surfaces the thumbnail decision early, before the script is committed.

The economics of YouTube reward the thumbnail more than the title, the script, or the production. youtube-thumbnail takes that observation seriously: the thumbnail is the artifact that gets approved first, before the writer commits to a script it might not be able to carry.

§01 What it does

The skill checks the project for a reference photo configuration, looking in thumbnail-config.md, brand-kit.md, and about-me.md in that order. The reference photo is the load-bearing input: it stays consistent across videos so the channel develops a visual signature. If no reference exists, the skill asks for one (a clear headshot with distinctive lighting and an expression the creator plans to reuse).
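The lookup order described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name `find_reference_config` and the assumption that the files sit directly in the project root are hypothetical. Only the three filenames and their order come from the description.

```python
from pathlib import Path
from typing import Optional

# Filenames and their order come from the skill description;
# everything else in this sketch is an illustrative assumption.
CONFIG_CANDIDATES = ("thumbnail-config.md", "brand-kit.md", "about-me.md")


def find_reference_config(project_dir: str) -> Optional[Path]:
    """Return the first candidate file that exists, or None if none do."""
    root = Path(project_dir)
    for name in CONFIG_CANDIDATES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    # Nothing found: this is the point where the skill would ask for a headshot.
    return None
```

A `None` result corresponds to the case where the skill asks the creator for a reference photo.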

Then it asks for the video title and an emotional tone (shock, curious, confident, frustrated). With those inputs it applies a fixed set of high-CTR thumbnail rules: the face fills thirty to fifty percent of the frame; large text runs three to five words at most; the palette is limited to two dominant colours (brand primary plus a high-contrast accent such as yellow, red, or cyan); there is one clear focal element besides the face; there is no small text; and no logo sits in the bottom-right corner, where YouTube overlays the video-duration badge.
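The rules above lend themselves to a simple checklist. The sketch below is one hypothetical way to encode them. `ThumbnailSpec` and `check_rules` are names invented for illustration, and the "no small text" rule is left out because it cannot be judged from these fields alone.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThumbnailSpec:
    """Hypothetical description of a planned thumbnail."""
    face_fraction: float         # share of the frame the face occupies, 0.0 to 1.0
    text: str                    # large overlay text
    dominant_colours: List[str]  # e.g. ["brand-blue", "yellow"]
    focal_elements: int          # clear focal elements besides the face
    logo_bottom_right: bool      # True if a logo sits in the bottom-right corner


def check_rules(spec: ThumbnailSpec) -> List[str]:
    """Return a list of rule violations; an empty list means the spec passes."""
    problems = []
    if not 0.30 <= spec.face_fraction <= 0.50:
        problems.append("face should fill 30 to 50% of the frame")
    word_count = len(spec.text.split())
    if not 3 <= word_count <= 5:
        problems.append("large text should be three to five words")
    if len(spec.dominant_colours) != 2:
        problems.append("use exactly two dominant colours")
    if spec.focal_elements != 1:
        problems.append("use one focal element besides the face")
    if spec.logo_bottom_right:
        problems.append("keep the bottom-right corner free of logos")
    return problems
```

For example, `check_rules(ThumbnailSpec(0.4, "I Quit My Job", ["brand-blue", "yellow"], 1, False))` returns an empty list.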

It produces a brief naming the composition, text placement, palette, and supporting element. Once the writer approves, it outputs the Gemini prompt with the reference image baked in.
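To make the brief-to-prompt step concrete, here is a rough sketch of how an approved brief might be turned into prompt text. The `Brief` fields mirror the four items the brief names. The `build_prompt` function and the prompt wording are assumptions made for illustration, not the text the skill actually emits.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Brief:
    """Hypothetical container for the four things the brief names."""
    composition: str
    text: str
    text_placement: str
    palette: Tuple[str, str]  # (brand primary, high-contrast accent)
    supporting_element: str


def build_prompt(brief: Brief, tone: str) -> str:
    """Assemble an image prompt from an approved brief (wording is illustrative)."""
    primary, accent = brief.palette
    lines = [
        "Create a YouTube thumbnail.",
        "Use the attached reference photo as the source for the creator's face.",
        f"Expression and mood: {tone}.",
        f"Composition: {brief.composition}.",
        f'Large text: "{brief.text}", placed {brief.text_placement}.',
        f"Palette: {primary} with {accent} as the accent colour.",
        f"Supporting element: {brief.supporting_element}.",
    ]
    return "\n".join(lines)
```

Keeping the brief as structured data means the writer approves the decisions (composition, text, palette, supporting element) before any prompt wording is generated.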

§02 The thumbnail is the title's editor

Most creators write the title and then ask the thumbnail to follow. The interesting move is to reverse the order. A title that cannot be reduced to three to five words on a thumbnail is usually a title that is doing too much. Forcing the thumbnail constraint first edits the title down to its load-bearing claim, which is the part that actually decides whether the video gets clicked.

The reference photo discipline matters too. A YouTube channel with eight different faces in its thumbnails reads as eight different channels to the algorithm. The reference photo enforces visual continuity even as titles, palettes, and supporting elements rotate.

§03 Setup

# Trigger phrases:
#   "thumbnail"
#   "youtube thumbnail"
#   "build me a thumbnail"

As with the other Gemini-output skills in the bundle, the prompt goes into a Gemini chat with Create Image enabled. Attach the reference photo to that chat; the prompt instructs Gemini to use it as the source for the creator's face.

§04 Caveats

The two-colour rule is strict. Three colours produce a thumbnail that competes with itself at small sizes. Resist the urge to add a third even if the brand kit names one.

The skill will not negotiate text length. Six words is the absolute ceiling, five is the working cap, three to four is ideal. Hook-generator pairs naturally here for testing variations on the hook phrase before locking the thumbnail.
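The word-count tiers can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_text_length` and the label strings are hypothetical, and the "unspecified" branch exists only because no guidance is given for fewer than three words.

```python
def classify_text_length(text: str) -> str:
    """Map thumbnail text to the word-count tiers described above."""
    words = len(text.split())
    if words > 6:
        return "reject"       # over the absolute ceiling
    if words == 6:
        return "ceiling"      # allowed, but at the absolute limit
    if words == 5:
        return "working cap"
    if words >= 3:
        return "ideal"        # three to four words
    return "unspecified"      # fewer than three: no guidance given
```

For instance, `classify_text_length("I quit my job")` returns `"ideal"`, while a seven-word phrase returns `"reject"`.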

◇ summary · field notes
$ vibgineer summarize youtube-thumbnail
  01 Inputs
    • reference photo of creator
    • video title (typed or suggested)
    • emotional tone
  02 Best practices
    • face = 30 to 50% of frame
    • 3 to 5 words of large text
    • two dominant colours
    • one focal element + face
  03 Brief
    • composition + gaze direction
    • text placement
    • palette + supporting element
  04 Prompt
    • Gemini-ready output
    • reference image baked in
    • retainer for branded reuse
✓ 1 thumbnail · sells the video before anyone hears a word.