A content schema is the structure every record of a given type shares. For articles, it might include title, slug, excerpt, body, publish date, category, tags, and SEO fields. For categories, a different set. The schema is the contract between your content and your templates.
Most people design their schema by adding fields as they need them. This works until it does not — until you have a hundred article files that are inconsistent with each other, templates that handle optional fields with special cases, and AI tools that produce unpredictable output because the pattern is not clear.
A schema designed deliberately avoids most of these problems.
Start with what every record needs, not what some records need
The first principle of schema design is distinguishing required fields from optional ones — and being honest about which is which.
A required field should be present in every record of that type, always. If you are tempted to make a field required but know there will be exceptions, it is not actually required — it is optional with a strong default. Model it that way.
Optional fields should have a clear purpose for the records that use them. An optional field that you are not sure you will ever actually use is clutter. Add it when you need it, not speculatively.
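One way to make the distinction concrete is to write the record type down in code. The sketch below is illustrative only: it assumes a Python dataclass, and the field set and the "general" default category are hypothetical, not taken from any particular site.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Article:
    # Required: present in every article, no exceptions.
    title: str
    slug: str
    body: str
    publish_date: date
    # Optional with a strong default: a field that almost every record
    # sets the same way. The default value here is an assumption.
    category: str = "general"
    # Optional: only set on the records that actually use it.
    excerpt: Optional[str] = None


post = Article(
    title="Designing a content schema",
    slug="designing-a-content-schema",
    body="...",
    publish_date=date(2024, 1, 15),
)
```

Leaving out a required field fails immediately with a TypeError, while leaving out category quietly falls back to the default. That is the behavior the required/optional split is meant to describe.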
Name fields for what they contain, not how they render
Field names should describe the data, not the presentation. excerpt is better than card_description because the data is the excerpt — how it renders is the template's concern. sidebar_profile is better than right_sidebar because the value is a profile slug, not a layout instruction.
This matters when your templates change. If you name fields for how they render, every template change potentially requires schema changes. If you name fields for what they contain, templates change without touching the schema.
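As a rough illustration, here is one record whose excerpt field is read by two different templates. The render functions and the sample values are hypothetical stand-ins for whatever templating system the site uses.

```python
from html import escape

# A record with content-named fields (sample values are made up).
article = {
    "title": "Designing a content schema",
    "excerpt": "Decide what every record needs before you write the first one.",
    "sidebar_profile": "jane-doe",
}


def render_card(record):
    """Render the excerpt as body text on a listing card."""
    title = escape(record["title"])
    excerpt = escape(record["excerpt"])
    return f"<article><h2>{title}</h2><p>{excerpt}</p></article>"


def render_meta_description(record):
    """Reuse the same excerpt as the page's meta description."""
    excerpt = escape(record["excerpt"], quote=True)
    return f'<meta name="description" content="{excerpt}">'
```

Had the field been named card_description, the second template would be reading a field whose name no longer matches its use, and removing the card layout would leave a misleading name behind in every record.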
Be consistent about arrays vs strings
One of the most common sources of schema drift is a field that is stored as a string when it has one value and as an array when it has several. Tags are the typical case: some articles have one tag, others have several.
Pick one representation and use it everywhere. For a field like tags, always use an array, even for single values. ["single-tag"] is more consistent than toggling between a string and an array based on count. Templates are simpler, validation is simpler, and AI tools produce consistent output.
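If existing records already mix the two forms, a small normalizing step can bring them into line. This is a minimal sketch; the function name normalize_tags is made up for illustration, and it assumes a missing or empty value should become an empty list.

```python
def normalize_tags(value):
    """Return tags as a list, whatever form the record stored them in."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string is a single tag; an empty string means no tags.
        return [value] if value else []
    return list(value)
```

After normalization, templates can always loop over the field and never need to check its type first.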
Include a canonical example in your docs
The most useful piece of schema documentation is a canonical example — a complete, correctly structured record that represents the standard case. Not a spec document, not a list of fields and types. An actual example file that shows exactly what a correct record looks like.
When you or an AI assistant creates a new record, the canonical example is the pattern to match. It is faster to reference than a spec and harder to misinterpret.
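A canonical example can also double as a lightweight check. The sketch below compares a new record against the example by field name and value type. The record contents and the function name are hypothetical, and it simplifies by treating every field in the example as expected.

```python
# Hypothetical canonical article; the values are placeholders.
CANONICAL_ARTICLE = {
    "title": "How to design a content schema",
    "slug": "how-to-design-a-content-schema",
    "excerpt": "A short summary of the article.",
    "body": "Full article text goes here.",
    "publish_date": "2024-01-15",
    "category": "guides",
    "tags": ["schema"],
}


def diff_against_canonical(record, canonical=CANONICAL_ARTICLE):
    """List the ways a record departs from the canonical example."""
    problems = []
    for key, example in canonical.items():
        if key not in record:
            problems.append(f"missing field: {key}")
        elif type(record[key]) is not type(example):
            expected = type(example).__name__
            actual = type(record[key]).__name__
            problems.append(f"{key}: expected {expected}, got {actual}")
    for key in record:
        if key not in canonical:
            problems.append(f"unexpected field: {key}")
    return problems
```

An empty list means the record matches the pattern. Anything else is a specific, readable note about what to fix.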
Plan for what you know is coming
A schema should not try to anticipate every possible future requirement. But it should include the fields you know you will need based on where the site is going. If you know you will eventually add featured images, add a featured_image field now — empty string is fine — rather than retrofitting it across a hundred files later.
The test is: am I adding this because I know I will use it, or because I might use it someday? Add the former. Skip the latter until you actually need it.
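For comparison, this is roughly what the retrofit looks like when a field does have to be added after the fact. It is a sketch only: backfill_field is a made-up helper, and it assumes records are already loaded as dictionaries rather than read from files.

```python
def backfill_field(records, name, default=""):
    """Return copies of the records with `name` present on every one."""
    # Existing values are kept; only records missing the field get the default.
    return [{**record, name: record.get(name, default)} for record in records]


articles = [
    {"title": "First post"},
    {"title": "Second post", "featured_image": "second.png"},
]
updated = backfill_field(articles, "featured_image")
```

The code itself is short. The cost of a retrofit lies in running it across every file, reviewing the result, and updating the templates and canonical example to match.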