
Critical Insights from the Website Content Inventory

If you’ve ever taken part in a website redesign project, you’ve probably heard about that essential tool, the content audit. Before you conduct a content audit, however, you should start with a website content inventory.

While a content audit is a critical evaluation of the content on your website, a content inventory is a comprehensive listing of that content, alongside essential information such as titles, URLs, metadata, word counts, date created, date last modified, and redirects.

But a content inventory isn’t just prep work for an audit; done right, it can be a valuable source of insight about your website and a key diagnostic tool to help guide your redesign strategy. A content inventory offers a chance to conduct some quantitative analysis of your content before engaging in the qualitative assessment of a content audit.

What can we learn from a website content inventory?

How much content you actually have on your website

It’s not uncommon to pull an inventory and find that you have many, many more pages on your site than you expected. This is especially common if your site is a few years old and you have a distributed governance model with many content contributors. Without ongoing auditing and established protocols for deleting content, sites tend to sprawl over time.

A content inventory will list all the pages on your site, including those excluded from navigation but never unpublished. For example, I once found a guide to local restaurants and hotels on an academic department site. The website project stakeholders were astonished; the page had been created a few years earlier when they were hosting a conference. It had been removed from the navigation, but the page itself was still live and getting a surprising number of visits, thanks to its SEO-friendly layout resplendent with high-quality keywords.

Such ghost pages indicate that your training needs to address when and how to remove a page from the site, not just the navigation.
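One way to surface ghost pages like the one described above is to compare the URLs in your inventory against the URLs linked from your navigation. The sketch below assumes you can export both lists; the sample URLs and variable names are hypothetical.

```python
# Hypothetical data: every live URL found by the inventory, and every URL
# that appears in the site's menus. Replace both with your own exports.
inventory_urls = {
    "/",
    "/about",
    "/academics",
    "/academics/majors",
    "/conference-hotels-and-restaurants",
}
navigation_urls = {
    "/",
    "/about",
    "/academics",
    "/academics/majors",
}

# Pages that are live but unreachable from the navigation are candidates for review.
ghost_pages = sorted(inventory_urls - navigation_urls)

for url in ghost_pages:
    print(url)
```

Not every page on this list is a problem (campaign landing pages are often kept out of menus on purpose), so treat the output as a review queue rather than a delete list.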

Potential governance and accessibility issues

A content inventory will also tell you how many PDFs you have on your site. PDFs are not ideal from an accessibility standpoint, so if your inventory reveals a large quantity of them, you should plan to audit them for currency, relevance, and accessibility. An abundance of PDFs can also indicate governance issues pertaining to training or resource allocation:

  • Do your editors understand the user experience and SEO value of publishing content as HTML pages instead of PDFs? 

  • Are they defaulting to posting PDFs instead of building out pages because they lack adequate time for content publishing? 

  • Or do they lack authority? Sometimes a senior stakeholder will insist that a document be posted as a PDF, and the content editor doesn’t feel empowered to resist.

Your governance should account for how the web team, as website authorities, can step in to advocate for better solutions.
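To gauge the scale of the PDF issue discussed above, you can count PDFs in the inventory and see which sections publish the most. This is a minimal sketch that assumes PDFs can be recognized by a ".pdf" file extension in the URL path; the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical inventory URLs; in practice, read these from your export.
urls = [
    "https://example.edu/about",
    "https://example.edu/forms/add-drop.pdf",
    "https://example.edu/forms/Transcript-Request.PDF",
    "https://example.edu/handbook/student-handbook.pdf?v=2",
    "https://example.edu/handbook",
]

def is_pdf(url):
    """Assumption: a PDF is any URL whose path ends in .pdf (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".pdf")

pdf_urls = [u for u in urls if is_pdf(u)]
pdf_share = len(pdf_urls) / len(urls)

# Count PDFs by the first path segment to see which sections rely on them most.
pdfs_by_section = Counter(urlparse(u).path.split("/")[1] for u in pdf_urls)

print(f"{len(pdf_urls)} PDFs ({pdf_share:.0%} of inventory)")
for section, count in pdfs_by_section.most_common():
    print(f"{section}: {count}")
```

If your inventory tool reports a content type column, filtering on that column is likely more reliable than checking file extensions.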

If your site has structural problems

A content inventory can help you identify structural problems with your website by giving you a bird's-eye view of key information such as URL paths and headers.

Sort by URL and scan down the column, then ask yourself these questions:

  • Does your site have a consistent URL structure?

  • Are your pages nicely organized into relevant subfolders, or do you have a lot of pages floating around at the root level? 

  • Is there a relatively even balance of pages within top-level subfolders? 

The answers to these questions can point to problems with information architecture and, once again, governance. 
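Scanning the URL column by eye works for small sites; for larger ones, a short script can count pages per top-level subfolder and flag pages sitting at the root. The sketch below is one possible approach, not a feature of any particular tool. The sample URLs are hypothetical, and it assumes any URL with a single path segment counts as a root-level page.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample; in practice, read the URL column from your inventory.
urls = [
    "https://example.edu/",
    "https://example.edu/about",
    "https://example.edu/conference-hotels",
    "https://example.edu/academics/majors",
    "https://example.edu/academics/minors",
    "https://example.edu/student-resources/advising",
    "https://example.edu/student-resources/tutoring",
    "https://example.edu/student-resources/forms/add-drop",
]

def top_level_section(url):
    """Return the first path segment, or '(root)' for pages outside any subfolder."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Assumption: zero or one segment means the page sits directly at the root.
    return segments[0] if len(segments) > 1 else "(root)"

section_counts = Counter(top_level_section(u) for u in urls)

for section, count in section_counts.most_common():
    print(f"{section:20} {count}")
```

A large "(root)" count or one section that dwarfs the rest are the two patterns worth investigating.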

If you have a lot of pages outside of the primary subfolders, you need to revisit your sitemap to address why so much of your content doesn’t fit in your navigational structure. The answer may stem from a lack of governance: 

  • Who owns the menu? 

  • Who can add new pages and where? 

  • Do you need to adjust the permissions for content editors to restrict where or whether they can add new pages?

If one subfolder holds many more pages than the others, that again suggests you need to review your information architecture. A category like “Student Resources” might be too broad and might need to be split into other categories. Your sitemap might have been neatly organized when you launched your site, but without consistent auditing between redesigns, some parts of the site may swell as programming and communication needs evolve.

The content inventory can also call attention to structural challenges at the page level. Consider your headers:

  • Does every page have one and only one H1? 

  • Is each page’s H1 unique within the site? 

Empty H1 fields may point to development issues. Look into why the H1 field wasn’t required when your site was built, and whether you can now adjust it to ensure that all pages have an H1. 
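The two H1 questions above can be checked mechanically once the inventory records the H1 text found on each page. The following is a rough sketch; the row structure and sample values are hypothetical, so adapt the field names to whatever your export uses.

```python
from collections import defaultdict

# Hypothetical inventory rows: each page's URL and the H1 text(s) found on it.
pages = [
    {"url": "/about", "h1": ["About Us"]},
    {"url": "/contact", "h1": []},
    {"url": "/news", "h1": ["News", "Latest Stories"]},
    {"url": "/about/history", "h1": ["About Us"]},
]

# Pages with no H1 at all, and pages with more than one.
missing_h1 = [p["url"] for p in pages if len(p["h1"]) == 0]
multiple_h1 = [p["url"] for p in pages if len(p["h1"]) > 1]

# Group URLs by normalized H1 text to find headings reused across pages.
urls_by_h1 = defaultdict(list)
for page in pages:
    for text in page["h1"]:
        urls_by_h1[text.strip().lower()].append(page["url"])

duplicate_h1 = {text: urls for text, urls in urls_by_h1.items() if len(urls) > 1}

print("Missing H1:", missing_h1)
print("Multiple H1s:", multiple_h1)
print("Duplicate H1 text:", duplicate_h1)
```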

You should also look at other types of headers in your inventory to identify usage patterns. 

  • Are you using H2s appropriately? 

  • Are there pages that don’t use any subheads? They would probably benefit from some revision to make the page more scannable for users. 

  • Are there pages that skip H2 and go straight to H3? That’s an accessibility failure that needs to be addressed.
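Skipped heading levels can also be detected automatically if you have each page's headings in document order. Many inventory exports do not preserve that order, so the input here is an assumption, as are the sample pages. The sketch flags any heading that is more than one level deeper than the heading before it.

```python
def skipped_levels(levels):
    """Return (previous, current) pairs where a heading jumps more than one level deeper."""
    skips = []
    previous = 0  # start of page counts as level 0, so the first heading should be an H1
    for level in levels:
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips

# Hypothetical data: heading levels in the order they appear on each page.
headings_by_page = {
    "/admissions": [1, 2, 3, 2, 3],
    "/financial-aid": [1, 3, 3, 2],
    "/events": [1],
}

pages_with_skips = {}
for url, levels in headings_by_page.items():
    skips = skipped_levels(levels)
    if skips:
        pages_with_skips[url] = skips

for url, skips in pages_with_skips.items():
    print(url, skips)
```

Moving back up the hierarchy (an H3 followed by an H2) is fine and is not flagged; only jumps downward that skip a level are reported.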

Opportunities to improve SEO

A content inventory can also reveal opportunities to improve your search engine optimization.

Often an inventory will include title length (in characters). Your titles should be no more than 70 characters. Filter your inventory to find all the pages with titles that exceed 70 characters; those page titles must be revised. Conversely, look for all pages with short titles of fewer than 40 characters. You have room to add SEO-boosting terms to those titles.
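If your inventory lacks a title-length column, the filter is easy to reproduce. Here is a minimal sketch using the 70- and 40-character thresholds from the paragraph above; the page titles are hypothetical examples.

```python
MAX_TITLE_LENGTH = 70  # titles longer than this should be revised
MIN_TITLE_LENGTH = 40  # titles shorter than this have room for more terms

# Hypothetical URL-to-title mapping; in practice, read it from your inventory.
titles = {
    "/about": "About Us",
    "/admissions": "Undergraduate Admissions: Deadlines and Requirements",
    "/aid": "Financial Aid and Scholarships for First-Year, Transfer, and International Students",
}

too_long = [url for url, title in titles.items() if len(title) > MAX_TITLE_LENGTH]
too_short = [url for url, title in titles.items() if len(title) < MIN_TITLE_LENGTH]

print("Revise (too long):", too_long)
print("Room to expand (short):", too_short)
```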

Your inventory may also show how many images lack alt text. Since alt text is searchable, good descriptive alt text can help improve your search results. 

All images need an alt attribute to be accessible. Most images need alt text in that attribute to describe what the image shows; the alt text will show if the image doesn’t download, and will be presented to screen reader users. However, images that are purely decorative, that contain no content and are used for non-informative purposes such as to control layout, can have empty alt attributes (alt="").

So, having some images with empty alt attributes is acceptable, as long as they are the right ones. But your content inventory can show you if you have an excessive number of images without alt text. If a large proportion of your alt attributes are empty, you should take a closer look to see why that is. Functional (non-decorative) images that lack alt text point to governance issues:

  • Do you have a policy about accessibility? 

  • How is it enforced? 

  • How are you conducting quality assurance to find and correct accessibility errors? 

  • Do your editors need more training about why alt text matters?
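To see whether the share of images without alt text is large enough to warrant a closer look, you can separate images with no alt attribute at all from those with an empty one. This sketch assumes your inventory distinguishes the two cases, represented here as None and an empty string; the sample rows are hypothetical.

```python
# Hypothetical image rows: alt is None when the attribute is absent,
# "" when it is present but empty (acceptable for decorative images).
images = [
    {"page": "/about", "src": "team.jpg", "alt": "Faculty and staff outside the library"},
    {"page": "/about", "src": "divider.png", "alt": ""},
    {"page": "/news", "src": "award.jpg", "alt": None},
    {"page": "/news", "src": "campus.jpg", "alt": ""},
]

missing_alt = [img for img in images if img["alt"] is None]
empty_alt = [img for img in images if img["alt"] is not None and img["alt"].strip() == ""]

share_without_text = (len(missing_alt) + len(empty_alt)) / len(images)
print(f"{share_without_text:.0%} of images have no alt text")

# A script cannot tell decorative images from functional ones,
# so list the empty ones for a person to review.
for img in empty_alt:
    print("Review:", img["page"], img["src"])
```

Images in the missing group always need attention; images in the empty group need a human judgment about whether they are truly decorative.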

How to conduct a website content inventory

Okay, you’re sold on the benefits of a content inventory and excited about what you can learn. So how do you actually get started?

First, figure out what questions you’re going to ask. Are you most concerned about structure, SEO, outdated content, or all of the above?

Next, figure out how you will actually create your inventory. Your CMS may have an export function, but third-party tools may provide more data. ContentWRX, Screaming Frog, and URL Profiler are popular options. However, subscription prices may be prohibitive. If you have a service contract with an agency, they likely have subscriptions to one or more of these tools, so speak to your account manager about adding an inventory to your project list.

Some content inventory tools can be connected to your Google Analytics account to add key information like page visits, time on page, and bounce rate to your inventory. This will allow you to set measurable KPIs and keep track of how your site compares to the rest of the industry.

If possible, allot time for cleaning and coding the data. You may want to add attributes like template type or site section to make sorting and filtering easier. If your current site uses templates consistently (a landing page versus a general page, for example), then some time spent coding can help you look for differences between content types. This can be a time-consuming manual process, however. If your site structure is consistent, some tagging could be done programmatically using IF functions in the spreadsheet.
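The programmatic tagging mentioned above can also be done outside the spreadsheet. The sketch below derives a site section and a template type from each URL. The rule that single-segment URLs are landing pages is purely an illustrative assumption; substitute whatever rule matches how your own site is built.

```python
from urllib.parse import urlparse

def tag_page(url):
    """Derive hypothetical 'section' and 'template' attributes from a URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    section = segments[0] if segments else "home"
    # Assumption: top-level pages use the landing page template, deeper pages the general one.
    template = "landing page" if len(segments) <= 1 else "general page"
    return {"url": url, "section": section, "template": template}

# Hypothetical sample URLs.
urls = [
    "https://example.edu/",
    "https://example.edu/academics",
    "https://example.edu/academics/majors/biology",
]

tagged = [tag_page(u) for u in urls]
for row in tagged:
    print(row)
```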

Get comfortable with pivot tables. The real fun in conducting a content inventory comes in finding the patterns that might not be immediately apparent when you are viewing a spreadsheet with thousands of rows and dozens of columns. Pivot tables can help you find the relationships between data. For example, you might ask how the average page length differs between content types. A pivot table pulls that data together in a digestible form.
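The same kind of summary a pivot table gives you can be sketched in a few lines of code. This example answers the page-length question above by grouping rows by template and averaging word counts. The column names and figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical inventory rows with a template tag and a word count.
rows = [
    {"template": "landing page", "word_count": 180},
    {"template": "landing page", "word_count": 220},
    {"template": "general page", "word_count": 640},
    {"template": "general page", "word_count": 900},
    {"template": "general page", "word_count": 560},
]

# Group word counts by template, as a pivot table's row field would.
words_by_template = defaultdict(list)
for row in rows:
    words_by_template[row["template"]].append(row["word_count"])

# Summarize each group with a page count and an average.
summary = {
    template: {"pages": len(counts), "avg_words": mean(counts)}
    for template, counts in words_by_template.items()
}

for template, stats in summary.items():
    print(f"{template:15} pages={stats['pages']} avg_words={stats['avg_words']}")
```

A spreadsheet pivot table remains the quicker option for ad hoc exploration; a script like this is mainly useful when you want to rerun the same summary on each new inventory.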

Now dive in! Prepare to see your site from a whole new perspective. And once you’ve finished your content inventory, don’t forget to move on to the next step: the content audit.