Research · Jun 25, 2026 · 10 min read · by the Pressfold team

Public datasets that make great PR stories

You do not always have a proprietary dataset to work with, and you do not always need one. Governments, statistical agencies, regulators, and research bodies publish enormous quantities of data for free, and most of it never gets touched by anyone outside academia. For a PR team, that neglect is an opportunity. The same census table or licensing register that a reporter would never bother to download can become the spine of a story that earns national coverage, precisely because nobody else thought to look. The raw material is sitting in plain sight. The value you add is the angle, the analysis, and the framing.

This piece covers where to find public data that is actually usable for stories, how to turn a dry government table into a narrative a journalist will run, and how to credit your sources so your work holds up to scrutiny. The discipline here is not statistical wizardry. It is curiosity, patience, and a willingness to read a methodology note before you publish a headline.

Why public data is underused in PR

Most communications teams assume public data is either too boring or too complicated to build a campaign on. Both assumptions are usually wrong. The data feels boring because it is published in the format that suits the agency that collected it, not the format that suits a story. A spreadsheet with forty columns and cryptic field names looks like work, so people skip it. But buried in those columns is often a single comparison that nobody has made before, and a single fresh comparison is all a data story needs.

The complexity objection has a similar answer. You rarely need advanced statistics to find a publishable pattern in open data. You need to filter, sort, group, and compare. If you can build a pivot table, you can find a story. The journalists you are pitching are not statisticians either, so a clearly explained simple finding will almost always travel further than a sophisticated analysis nobody can follow. The bar is clarity, not cleverness.

There is also a competitive reason to lean on public data that PR teams rarely articulate. Because the source is open, anyone could in principle find the same story, which feels like a weakness but is actually a strength. It means a reporter can verify your finding independently in minutes, and verifiability is exactly what makes them comfortable running it. A proprietary number they cannot check requires them to trust you; an open dataset lets them trust the data. The work you are paid for is not access to the numbers, which are free to everyone. It is the noticing, the framing, and the willingness to do the reading that nobody else did.

Where to find data worth using

Public data sits in more places than the obvious national portal. The trick is knowing which sources reliably produce story-shaped material and which mostly produce noise. A few categories worth knowing well:

National statistics agencies. Census data, labour figures, household spending, and population projections are gold because they describe everyone. Their downside is lag, so check the reference period before you frame anything as current.
Sector regulators. Bodies overseeing finance, telecoms, energy, transport, and health publish complaint logs, registers, and performance data that map directly onto stories readers care about. Complaints data in particular is consistently newsworthy.
Open government portals. Many countries run a central catalogue where ministries deposit datasets. Quality varies wildly, but the search functions let you scan a huge surface area quickly.
Local and city authorities. Planning applications, parking fines, licensing, and spending records create stories that local press will run eagerly, and local coverage often gets picked up nationally.
International bodies. Cross-country datasets let you build league tables and "where does your country rank" angles, which reliably attract domestic coverage.

When you find a candidate, check three things before you invest time: when it was last updated, how often it is refreshed, and whether the methodology is documented. A dataset with no methodology note is a dataset you cannot defend, and one you should treat with caution no matter how tempting the numbers look.
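Those three checks are simple enough to write down as a screening step. The sketch below is one way to do it in Python, assuming you note the details by hand from the dataset's landing page. The `DatasetCandidate` fields, the two-year age threshold, and the example register are all hypothetical choices made for illustration, not a standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DatasetCandidate:
    """Details noted by hand from a dataset's landing page (hypothetical fields)."""
    name: str
    last_updated: date
    refresh_interval_days: Optional[int]  # None if no schedule is stated
    has_methodology_note: bool


def screening_concerns(candidate: DatasetCandidate, today: date,
                       max_age_days: int = 730) -> List[str]:
    """Return the reasons to be cautious; an empty list means none were found."""
    concerns = []
    age_days = (today - candidate.last_updated).days
    if age_days > max_age_days:  # the threshold is an arbitrary assumption
        concerns.append(f"last updated {age_days} days ago")
    if candidate.refresh_interval_days is None:
        concerns.append("no stated refresh schedule")
    if not candidate.has_methodology_note:
        concerns.append("no methodology note")
    return concerns


# Invented example: an old register with no schedule and no methodology note.
register = DatasetCandidate("Example licensing register", date(2023, 1, 15), None, False)
concerns = screening_concerns(register, today=date(2026, 6, 25))
for concern in concerns:
    print(concern)
```

A candidate that returns any concerns is not automatically unusable, but each one is something you would need to be able to explain to a reporter.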

Turning a government table into a story

A raw public table is data, not a story. The conversion happens when you ask it a human question. Instead of "what does this licensing register contain," ask "where are the most and fewest licences per head, and why might that be." Instead of "what are the complaint totals," ask "which sector generates the most complaints relative to its size, and is that getting worse." The question turns a flat table into a comparison, and the comparison is what makes news.

Some reliable moves for finding the angle inside open data:

Normalise it. Raw totals favour big places. Divide by population, by area, or by another sensible denominator and you surface the genuinely interesting outliers rather than just the largest cities.
Rank it. Readers love a league table. Ordering regions, sectors, or years gives a reporter an instant structure and gives local outlets a reason to localise.
Compare two periods. Change over time creates a trend, and a trend gives the story a reason to exist now.
Cross two datasets. The strongest open-data stories often join two public sources, for example spending data against outcome data, to reveal a relationship neither table shows alone.
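None of these four moves needs more than a few lines of arithmetic. The following is a minimal sketch in Python using only the standard library. The region names and every figure are invented for illustration, and the "inspections" table is a hypothetical second source standing in for whatever you would actually join against.

```python
# All figures below are invented for illustration; they are not real data.
licences_2025 = {"Northshire": 1200, "Eastvale": 450, "Westmoor": 300}
licences_2024 = {"Northshire": 1100, "Eastvale": 300, "Westmoor": 310}
population = {"Northshire": 600_000, "Eastvale": 90_000, "Westmoor": 100_000}
# A second hypothetical source that does not cover every region.
inspections_2025 = {"Northshire": 240, "Eastvale": 45}

# Normalise: licences per 10,000 residents instead of raw totals.
per_10k = {
    region: licences_2025[region] * 10_000 / population[region]
    for region in licences_2025
}

# Rank: order regions from the highest rate to the lowest.
league_table = sorted(per_10k, key=per_10k.get, reverse=True)

# Compare two periods: percentage change on the prior year.
pct_change = {
    region: (licences_2025[region] - licences_2024[region]) * 100 / licences_2024[region]
    for region in licences_2025
}

# Cross two datasets: keep only the regions present in both sources.
shared = sorted(licences_2025.keys() & inspections_2025.keys())
licences_per_inspection = {
    region: licences_2025[region] / inspections_2025[region] for region in shared
}

for region in league_table:
    print(f"{region}: {per_10k[region]:.1f} per 10,000, "
          f"{pct_change[region]:+.1f}% on prior year")
```

On these made-up numbers, Northshire has the largest raw total but the lowest rate per head, which is exactly the kind of reversal that normalising is meant to surface. Note that the join silently drops Westmoor because the second source does not cover it; in a real analysis you would want to say so in your method note.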

Once you have a finding, pressure-test it the way a journalist's editor will. Is the pattern large enough to matter? Could it be explained by something dull, like a change in how the data was collected? Have you read enough of the methodology to know what the numbers actually mean? Building this scrutiny into your own process is the difference between a story that survives a fact-check and one that quietly falls apart after publication, taking your credibility with it.

The most dangerous trap in open data is the boundary change, along with its close relatives, the definition change and the reworded question. Statistical agencies redraw regions, regulators alter what counts as a reportable complaint, and survey questions get reworded between waves. When that happens, a number can leap or collapse for reasons that have nothing to do with the real world, and an unwary analyst will pitch the artefact as a trend. Before you frame any change over time as news, confirm that the thing being counted was counted the same way in both periods. This single check catches most of the embarrassing errors that get data-PR teams a reputation for sloppiness, and it usually takes about ten minutes of reading the release notes.
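One way to make that check hard to skip is to store what was counted alongside each figure and refuse to compute a change when the two periods disagree. The sketch below assumes you copy the definition and the geography label from the release notes yourself; `PeriodFigure`, `NotComparableError`, and the example values are hypothetical names and numbers, not part of any agency's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodFigure:
    period: str
    value: float
    definition: str        # what was counted, as worded in the release notes
    boundary_version: str  # label for the geography in use


class NotComparableError(ValueError):
    """Raised when two figures were not counted the same way."""


def percentage_change(earlier: PeriodFigure, later: PeriodFigure) -> float:
    """Percentage change between periods, only if both were counted alike."""
    mismatches = []
    if earlier.definition != later.definition:
        mismatches.append("definition")
    if earlier.boundary_version != later.boundary_version:
        mismatches.append("boundaries")
    if mismatches:
        raise NotComparableError(
            f"{earlier.period} and {later.period} differ on: {', '.join(mismatches)}"
        )
    if earlier.value == 0:
        raise ValueError("cannot compute a percentage change from a zero base")
    return (later.value - earlier.value) * 100 / earlier.value


# Invented figures: the second comparison changes what counts as a complaint.
y2024 = PeriodFigure("2024", 200, "complaints received", "2021 regions")
y2025 = PeriodFigure("2025", 260, "complaints received", "2021 regions")
y2025_new = PeriodFigure("2025", 90, "complaints upheld", "2021 regions")

print(percentage_change(y2024, y2025))
try:
    percentage_change(y2024, y2025_new)
except NotComparableError as error:
    print(f"Do not pitch this as a trend: {error}")
```

The comparison here is a plain string match, so it only works if you record the labels consistently. It is a prompt to read the release notes, not a substitute for reading them.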

Crediting sources and protecting your credibility

The fastest way to lose a journalist's trust is to present public data as if it were your own discovery, or to obscure where a number came from. The opposite approach builds you a reputation as a reliable source. Always name the dataset, the publishing body, and the time period clearly, both in your pitch and in any asset you hand over. Reporters need to attribute the data, and making that easy for them is a small favour they remember.

Good attribution practice is straightforward but easy to neglect:

Cite the original, not an aggregator. If you found the figure on a third-party site, trace it back to the source that published it and link there. Aggregators introduce errors.
State the reference period. "Latest available data" is meaningless in six months. Name the year or quarter explicitly.
Quote the methodology limits. If the agency notes a caveat, repeat it. A pitch that acknowledges a limitation reads as honest, not weak.
Show your own working. If you normalised, ranked, or combined datasets, say exactly how. Offer to share your calculation file. Reporters who can check your maths will trust your next pitch faster.

Transparency is not just ethical hygiene; it is a competitive advantage. The studies that get cited repeatedly are the ones reporters know they can rely on without getting burned. A clean source line and an honest method note do more for your long-term coverage rate than any amount of clever framing, because they turn a one-off placement into a standing relationship.

Making the data easy for a journalist to run

Finding the story is half the job. The other half is packaging it so a busy reporter can publish without redoing your work. That means handing over a short summary of the finding, the headline numbers, a clean version of the underlying data, and a chart or two they can drop straight into the page. The more friction you remove, the more likely the story runs and the more accurately it gets reported.

Keep the data file tidy and self-explanatory. Label every column in plain language, remove the agency's internal codes, and include a row that states the source and date so the file stands on its own if it gets forwarded. A reporter who can understand your file in thirty seconds is a reporter who will use it. For the visual side, the formats reporters actually reuse and the mistakes that get charts spiked are worth studying in our guide to visualising data so journalists can actually use it.
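As a rough illustration of that tidying step, the sketch below renames cryptic columns, drops an internal code, and appends a source row using Python's built-in `csv` module. The field names (`RGN_CD`, `RGN_NM`, `CMPL_TOT`), the figures, and the source line are all invented; a real agency file will have its own codes, and placing the source in the final row is just one reasonable convention.

```python
import csv
import io

# Map each agency field to a plain-language label; None means drop the column.
# These field names are hypothetical.
COLUMN_NAMES = {
    "RGN_CD": None,
    "RGN_NM": "Region",
    "CMPL_TOT": "Complaints received",
}


def tidy_csv(rows, column_names, source_line):
    """Return CSV text with readable headers and a final row naming the source."""
    kept = [field for field, label in column_names.items() if label is not None]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column_names[field] for field in kept])
    for row in rows:
        writer.writerow([row[field] for field in kept])
    # The source row lets the file stand on its own if it gets forwarded.
    writer.writerow([f"Source: {source_line}"])
    return buffer.getvalue()


# Invented rows standing in for an agency export.
raw_rows = [
    {"RGN_CD": "E01", "RGN_NM": "Northshire", "CMPL_TOT": 412},
    {"RGN_CD": "E02", "RGN_NM": "Eastvale", "CMPL_TOT": 157},
]
tidy = tidy_csv(
    raw_rows,
    COLUMN_NAMES,
    "Example Regulator complaints register; reference period 2025",
)
print(tidy)
```

Writing to a string buffer keeps the example self-contained; in practice you would write the same rows to a file and send that alongside your pitch.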

Resist the urge to over-process. Reporters are wary of data that has been massaged into a marketing-friendly shape, and rightly so. Present the cleanest honest version of the finding, flag anything that complicates it, and let the strength of the comparison do the work. A finding that survives an editor's scepticism is worth ten that look impressive but collapse under one phone call.

It also pays to anticipate the obvious counter-question. Whatever your headline number is, a good reporter will immediately raise the natural rival explanation, and if you have not thought of it before they do, you look unprepared. If your story is that one region leads on some measure, be ready to say whether that holds once you account for population, income, or age. If your story is a sharp rise, be ready to say whether the previous year was simply an unusually low base. Handing the reporter the answer to the question they were about to ask is what turns a tentative "let me look into this" into a confident commission, and it is the kind of preparation that gets you invited back with the next dataset.
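The low-base question in particular is quick to answer with numbers. A minimal sketch, assuming you have a few earlier years of the same series to hand: compare the latest figure against both the previous year and the average of the earlier years. The function name and the figures are invented for illustration, and using a simple mean as the baseline is one choice among several.

```python
from statistics import mean


def rise_in_context(history, latest):
    """Compare the latest figure with the previous year and the earlier average.

    `history` holds the earlier years in order, oldest first.
    """
    previous = history[-1]
    baseline = mean(history)
    return {
        "vs_previous_year_pct": (latest - previous) * 100 / previous,
        "vs_earlier_average_pct": (latest - baseline) * 100 / baseline,
    }


# Invented series: the year before the latest one was unusually low.
earlier_years = [100, 104, 98, 60]
context = rise_in_context(earlier_years, latest=102)

print(f"Change on previous year: {context['vs_previous_year_pct']:+.1f}%")
print(f"Change on earlier average: {context['vs_earlier_average_pct']:+.1f}%")
```

With these made-up figures the year-on-year rise is 70 per cent, but the latest value is only about 13 per cent above the earlier average. Both numbers are true; the second is the one a sceptical editor will ask for.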

Building a public-data habit

The teams who win with open data are the ones who treat it as a routine rather than a rescue mission. Keep a shortlist of the half-dozen sources in your sector that reliably refresh, and check them on a calendar. Many agencies publish a release schedule months in advance, which means you can know the exact date a dataset will drop and have your angle drafted before the file even exists. When a regulator publishes its annual complaints figures or a statistics agency puts out a new release, you want to be the team that has already framed the angle while everyone else is still downloading the file. Speed against a fresh public release is one of the most dependable routes to coverage, because the reporter covering that release is actively looking for an angle and you may be the only one offering a finished one.
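The calendar itself can be as simple as a list of names and dates. The sketch below assumes you have copied the dates by hand from each agency's published schedule; the source names, the dates, and the two-week window are hypothetical.

```python
from datetime import date, timedelta


def upcoming_releases(schedule, today, window_days=14):
    """Return (date, name) pairs due within the window, soonest first."""
    cutoff = today + timedelta(days=window_days)
    due = [
        (release_date, name)
        for name, release_date in schedule.items()
        if today <= release_date <= cutoff
    ]
    return sorted(due)


# Invented shortlist of sources and release dates.
release_schedule = {
    "Regulator annual complaints figures": date(2026, 7, 2),
    "Statistics agency household spending": date(2026, 9, 15),
    "City licensing register refresh": date(2026, 6, 30),
}

due_soon = upcoming_releases(release_schedule, today=date(2026, 6, 25))
for release_date, name in due_soon:
    print(f"{release_date.isoformat()}: draft the angle for '{name}'")
```

Run weekly, a check like this is enough to tell you which angles to draft before the files land.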

Over time you will learn which sources produce stories and which waste your afternoon, and that judgement compounds. Pair your open-data routine with the proprietary patterns inside your own business, an approach covered in "The data story hiding in data you already have", and you have two complementary engines: one free and public, one private and defensible. Used together, they mean you are never short of a story, and you are never reliant on a budget to find one. Public data rewards the patient and the curious, and it asks for nothing but the willingness to read carefully and credit honestly.
