Guide · Jun 25, 2026 · 10 min read · by the Pressfold team
Turning customer data into stories, safely
Sitting inside almost every business is a quiet competitive advantage that most companies never use: their own data. The purchases, the searches, the support tickets, the booking patterns, the things customers actually do rather than what a survey panel claims they would do. For a data PR studio, this is gold, because behavioural data carries an authority that opinion polling never quite matches. When you can say "based on three million real transactions" rather than "based on a survey of 2,000 adults," journalists lean in.
But customer data is also the single most dangerous source we work with, because behind every row is a real person who never agreed to become a press release. Mishandle it and you do not just lose a story: you risk a regulatory breach, a loss of customer trust, and reputational damage that dwarfs any coverage you might have won. This piece is about how to mine genuine, newsworthy stories from customer data while staying firmly on the right side of the law and of basic ethics. The two goals are not in tension, but holding both at once takes discipline.
Start with the principle, not the dataset
The instinct when handed a rich dataset is to dive in and look for patterns. Resist it. The first question is not "what can we find" but "what are we actually allowed to do with this." Customer data was collected for a purpose (fulfilling orders, providing a service, running an account), and using it for PR is a new purpose that the original collection may not have covered. In many jurisdictions, repurposing personal data for something the customer never anticipated is exactly what data-protection law exists to police.
The governing principle across most modern privacy regimes is that you process personal data fairly, for a legitimate purpose, transparently, and no more than necessary. Before any analysis begins, the honest test is whether a reasonable customer, told plainly how their data was being used, would feel deceived. If the answer is yes, no amount of anonymisation downstream fully fixes the problem, and you should stop and involve whoever owns data governance in the business. Getting this clearance up front is far cheaper than discovering the problem after publication.
This early conversation also sets the data minimisation discipline that protects you later. The temptation is to request the full customer database "just in case," but the more personal data sits in your working files, the larger the surface area for something to go wrong: a leaked export, a misdirected attachment, a contractor who should never have seen it. The professional habit is to specify only the fields the story could conceivably need, and to ask for an extract that already excludes direct identifiers. If a story can be told from anonymised purchase categories and dates, there is no reason for names and email addresses to ever leave the secure system. Minimising what you hold is the single cheapest insurance policy in the whole workflow, and it costs nothing but the discipline to ask for less.
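A minimal sketch of what "ask for less" can look like when specifying an extract. The field names and the allow-list are hypothetical, chosen only for illustration, and in practice the filtering belongs inside the secure system so that the excluded fields never reach your working files at all.

```python
# Hypothetical allow-list: only the fields this particular story could need.
ALLOWED_FIELDS = ("purchase_category", "purchase_date", "region")


def minimise(records, allowed=ALLOWED_FIELDS):
    """Return copies of the records that keep only allow-listed fields."""
    return [{key: rec[key] for key in allowed if key in rec} for rec in records]


# Made-up example row; the name and email are direct identifiers.
raw = [
    {
        "name": "A. Example",
        "email": "a@example.com",
        "purchase_category": "garden",
        "purchase_date": "2026-03-14",
        "region": "North",
    },
]

extract = minimise(raw)
```

An allow-list is used rather than a block-list on purpose: a field added to the source system later stays out of the extract unless someone deliberately asks for it.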
Aggregation is your safest default
The cleanest way to turn customer data into a story is to never tell a story about an individual at all. Aggregate data (trends, totals, averages, distributions across thousands or millions of records) is both more newsworthy and far safer than anything granular. "Bookings to coastal destinations rose 40% this spring" is a strong, publishable line that describes a population, not a person. No single customer is identifiable, and the finding is more interesting precisely because it captures scale.
The discipline of aggregation is to ensure the groups you report on are genuinely large enough to hide the individuals inside them. A statistic about "customers in this one small town who bought this niche product" might technically be aggregate, but if only four people fit the description, you have effectively published their behaviour. A practical rule many practitioners adopt is to suppress any cell or breakdown that falls below a meaningful threshold: never report a slice so thin that a person, or someone who knows them, could reverse-engineer who is in it.
- Report populations, not people. Every headline figure should describe a group large enough that no individual stands out within it.
- Suppress thin cells. Set a minimum group size below which you do not publish a breakdown, and apply it consistently.
- Round and band. Reporting age bands rather than exact ages, or rounded figures rather than precise ones, reduces the chance of re-identification while costing you nothing journalistically.
- Avoid the unique combination. Watch for cases where several ordinary attributes combine to single someone out: a job title, a location, and a purchase together can be more identifying than any one alone.
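The banding and thin-cell rules above can be sketched in a few lines. This is an illustration under stated assumptions, not a standard: the minimum group size of 10, the ten-year bands, and the `region` and `age` field names are all hypothetical, and the right threshold is a decision for whoever owns data governance.

```python
from collections import Counter

MIN_GROUP_SIZE = 10  # assumed threshold, for illustration only


def age_band(age, width=10):
    """Map an exact age to a band label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def banded_counts(records, min_size=MIN_GROUP_SIZE):
    """Count customers per (region, age band); thin cells become None."""
    counts = Counter((rec["region"], age_band(rec["age"])) for rec in records)
    return {cell: (n if n >= min_size else None) for cell, n in counts.items()}


# Made-up records: one healthy cell and one cell of only three people.
records = [{"region": "North", "age": 34}] * 12 + [{"region": "South", "age": 71}] * 3
result = banded_counts(records)
```

Here the North, 30-39 cell keeps its count of 12, while the South, 70-79 cell is suppressed because only three records fall into it. Returning `None` rather than dropping the cell keeps the suppression visible to whoever reviews the table.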
Anonymisation done properly
Anonymisation gets talked about loosely, as if removing names makes data safe. It does not. Stripping the obvious identifiers (name, email, account number) produces pseudonymised data, which is still personal data under most laws because it can be linked back with a key or combined with other information. True anonymisation, where re-identification is no longer reasonably possible, is a much higher bar and the only state that genuinely takes data outside the scope of privacy rules.
The realistic risk is not that you publish a customer's name; nobody does that. It is that you publish enough seemingly innocuous detail that someone could piece an identity together, especially by combining your data with other public information. The defences are aggregation, banding, suppression of outliers, and a hard look at quasi-identifiers: the attributes that are not names but still narrow the field. The single highest-spending customer, the one person in a rare category, the outlier who skews an average: these are exactly the data points a journalist will find most interesting and exactly the ones most likely to identify a real person. Strip or generalise them before the data ever leaves the building.
When you do find a striking outlier, the move is to describe the phenomenon without exposing the person. "One customer's order was large enough to feed a small village" can become "a small number of bulk orders accounted for a surprising share of volume": the same insight, with no individual on display. Often the safer framing is also the better story, because it speaks to a pattern rather than a curiosity, and patterns are what make the data you already hold genuinely valuable for PR.
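One way to take that hard look at quasi-identifiers is to count how many records share each combination and flag the rare ones, in the spirit of a k-anonymity check. The sketch below is a rough screen rather than proof of anonymisation: the field names and the value of `k` are hypothetical, and it cannot see what an outsider might combine your data with.

```python
from collections import Counter

# Hypothetical quasi-identifier fields: not names, but they narrow the field.
QUASI_IDENTIFIERS = ("job_title", "town", "product")


def risky_combinations(records, keys=QUASI_IDENTIFIERS, k=5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(rec[key] for key in keys) for rec in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up records: six people share one combination, one person stands alone.
records = [{"job_title": "teacher", "town": "Northtown", "product": "bike"}] * 6
records = records + [{"job_title": "vet", "town": "Smallville", "product": "kayak"}]

risky = risky_combinations(records)
```

Any combination this returns is a candidate for generalising (a wider region, a broader product category) or for suppression before the extract is shared.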
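The reframing from one striking order to a pattern can be backed by a simple concentration figure: the share of total volume that comes from the largest orders. A minimal sketch with made-up order values; the choice of the top ten is an assumption, and the group described should itself be large enough to clear whatever minimum group size you have set.

```python
def top_share(order_values, top_n):
    """Fraction of total volume contributed by the top_n largest orders."""
    total = sum(order_values)
    if total == 0:
        return 0.0
    largest = sorted(order_values, reverse=True)[:top_n]
    return sum(largest) / total


# Made-up data: 90 ordinary orders of 10 units and 10 bulk orders of 150 units.
orders = [10] * 90 + [150] * 10
share = top_share(orders, top_n=10)  # 1500 of 2400 units
```

In this made-up data the ten bulk orders are 10% of orders but 62.5% of volume, which supports a line about concentration without pointing at any one customer.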
Consent, transparency and the spirit of the rules
Even with aggregated, anonymised data, the question of consent and transparency does not fully disappear. Customers who handed over their data for one purpose have a reasonable expectation about how it will be used, and using it to generate press coverage can stretch that expectation. The cleanest position is one where your privacy notice openly states that aggregated, anonymised data may be used for research and insight, so nothing comes as a surprise if a customer reads about it.
Where the data is sensitive (anything touching health, finances, location patterns, children, or other protected categories), the bar rises sharply and the safest answer is often simply not to use it for PR, regardless of how compelling the story would be. The reputational maths is brutally one-sided: the upside of one good story never outweighs the downside of a customer feeling their private circumstances were turned into marketing. When in doubt, leave it out. No coverage is worth a customer reading about their own vulnerability in a national headline.
Transparency also extends to how you present the data to journalists. Be honest about what the dataset is, where it comes from, and its limits. Your own customer base is not a representative sample of the whole population (it is the people who chose you), and a responsible pitch acknowledges that rather than implying universal truth. This honesty protects everyone: the journalist from publishing an overclaim, the client from a credibility hit, and the studio from being the source of a story that later unravels. The same care that makes the data ethical also makes the resulting data study credible enough to withstand scrutiny.
That sampling caveat deserves more weight than it usually gets, because it is where well-meaning customer-data stories most often overreach. A food delivery company's data tells you a great deal about people who order food delivery, and almost nothing reliable about the eating habits of the nation. The honest framing leans into that rather than hiding it: "among our customers" is not a weakness to be buried but a precise and defensible claim. Journalists increasingly know to ask whose data this is, and a pitch that volunteers the limitation reads as confident rather than caveated. Conversely, a behavioural insight dressed up as a universal truth is exactly the kind of overclaim that gets a story quietly corrected a week after publication, and corrections are remembered far longer than the original coverage.
A practical workflow that keeps you safe
The way to make all of this routine rather than nerve-wracking is to bake it into a repeatable workflow, so the safeguards are not heroic last-minute checks but standard steps. In practice this means a clear sequence that every customer-data story passes through before it ever reaches a journalist.
- Clear the purpose. Confirm with whoever owns data governance that using this data for PR is compatible with how it was collected, before any analysis starts.
- Work on a minimised extract. Pull only the fields you genuinely need, stripped of direct identifiers, rather than analysing the full raw customer database.
- Aggregate and threshold. Build the findings at population level, apply a minimum group size, and band or round wherever precision is not load-bearing.
- Hunt the re-identification risk. Actively look for outliers and quasi-identifier combinations that could expose an individual, and generalise them.
- Final ethics read. Ask the plain question: would a reasonable customer feel betrayed reading this? If yes, change the framing or drop the angle.
It helps to nominate two distinct sign-offs on that workflow rather than relying on a single person to catch everything. One reviewer checks the compliance angle (purpose, minimisation, thresholds, identifiers), and the other reads purely as a customer would, with no spreadsheet open, asking only whether the finished story feels fair. These are genuinely different lenses, and the same person rarely holds both well at once: the analyst who built the cut is too close to spot the re-identification risk, and the storyteller chasing the headline is the last person who should be policing it. Separating the roles turns safety from a matter of individual conscience into a process that does not depend on anyone being heroic on a deadline. Documenting that both checks happened also means that if a question ever comes back, you can show the care was deliberate rather than assert it after the fact.
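The five steps and two sign-offs can be recorded in something as small as the sketch below, so that "ready to pitch" is a checked state rather than a feeling. The step names, the `StoryReview` class, and the reviewer labels are all hypothetical; a shared checklist document would serve the same purpose.

```python
from dataclasses import dataclass, field
from typing import Optional

# The five workflow steps, in the order a story passes through them.
STEPS = (
    "clear_purpose",
    "minimised_extract",
    "aggregate_and_threshold",
    "reidentification_hunt",
    "final_ethics_read",
)


@dataclass
class StoryReview:
    """Record of completed steps and sign-offs for one customer-data story."""

    completed: set = field(default_factory=set)
    compliance_reviewer: Optional[str] = None
    customer_lens_reviewer: Optional[str] = None

    def outstanding(self):
        """List the workflow steps not yet completed, in order."""
        return [step for step in STEPS if step not in self.completed]

    def ready_to_pitch(self):
        """True only if every step is done and two different people signed off."""
        both_signed = bool(self.compliance_reviewer and self.customer_lens_reviewer)
        distinct = self.compliance_reviewer != self.customer_lens_reviewer
        return not self.outstanding() and both_signed and distinct


# Same person in both roles: the story is not ready, even with all steps done.
review = StoryReview(
    completed=set(STEPS),
    compliance_reviewer="reviewer_a",
    customer_lens_reviewer="reviewer_a",
)
blocked = review.ready_to_pitch()

# A second, different reviewer takes the customer lens.
review.customer_lens_reviewer = "reviewer_b"
cleared = review.ready_to_pitch()
```

The check deliberately refuses a single person signing both roles, and the stored record doubles as the documentation that both checks happened.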
Safe and newsworthy are not opposites
The lesson from running these projects is that the constraints rarely kill the story; they usually improve it. Aggregation pushes you toward the population-level trends that journalists find most compelling. The discipline of removing outliers forces you to find the pattern rather than the freak case. Honesty about your data's limits builds the credibility that earns repeat coverage rather than one-off hits followed by quiet corrections.
Customer data is one of the most powerful assets a data PR studio can work with, precisely because it reflects real behaviour at scale. Handled carelessly it is a liability that can outlast any campaign; handled with discipline it is a renewable source of authoritative, defensible stories that competitors relying on survey panels simply cannot match. The goal is never to choose between being safe and being newsworthy. It is to build a practice where being safe is what makes you newsworthy in the first place, and where the customer whose data made the story would, if they read it, have no reason to object.