Cybersecurity, explained for the rest of us.


Data brokers: the invisible industry that knows everything about you

Margot 'Magic' Thorne (@magicthorne) · June 10, 2026 · 12 min read
[Image: abstract visualization of data flowing from individual profiles through broker networks to purchasing companies]

You've never heard of Acxiom, Epsilon, or Oracle Data Cloud. You've never visited their websites, created accounts, or agreed to their terms. You don't receive bills from them. You can't log in to check what they know about you.

They know you anyway.

Data brokers are the invisible infrastructure of modern commerce. They collect information about you from hundreds of sources, compile it into detailed profiles, and sell access to anyone willing to pay. The industry operates mostly out of sight, connecting what you buy at Target to what you search on Google to where you live and how much you earn.

Here's how the machinery actually works.

The collection mechanism

Data brokers don't need your permission to collect information about you. They pull from sources that fall into four categories, each feeding a different part of the profile.

Public records form the foundation. Property deeds, voter registrations, court filings, professional licenses, marriage certificates, and bankruptcy records are all public information. Data brokers scrape these databases continuously. When you buy a house, register to vote, or renew a professional license, that information enters the broker ecosystem within weeks.

Purchase data comes from retailers, credit card companies, and loyalty programs. When you swipe a card or scan a rewards barcode, that transaction data often gets sold to data brokers. The FTC has documented how retailers sell aggregated purchase histories to third parties. The broker doesn't see your credit card number, but they see what you bought, when, where, and roughly how much you spent.

Online behavior tracking happens through cookies, pixels, and device fingerprints. Websites embed tracking code from data brokers and their partners. When you visit a site, that code logs your visit, the pages you viewed, how long you stayed, and what you clicked. The EFF's Privacy Badger documentation shows how pervasive this tracking has become. Over time, these fragments connect into browsing profiles that follow you across devices.
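To make the "fragments connect into profiles" step concrete, here is a minimal sketch of how page-view events logged under one tracker cookie ID could be rolled up into a browsing profile. The event fields, the cookie IDs, and the browsing_profiles function are all hypothetical, invented for this illustration. Real tracking pipelines are far more elaborate.

```python
from collections import defaultdict

# Hypothetical events, roughly as a third-party tracker might log them.
events = [
    {"cookie_id": "c-481", "site": "news.example", "page": "/mortgage-rates", "seconds": 95},
    {"cookie_id": "c-481", "site": "shop.example", "page": "/dog-food", "seconds": 40},
    {"cookie_id": "c-907", "site": "news.example", "page": "/sports", "seconds": 12},
]

def browsing_profiles(events):
    """Group page views by cookie ID into one small profile per browser."""
    profiles = defaultdict(lambda: {"sites": set(), "pages": [], "total_seconds": 0})
    for event in events:
        profile = profiles[event["cookie_id"]]
        profile["sites"].add(event["site"])
        profile["pages"].append(event["site"] + event["page"])
        profile["total_seconds"] += event["seconds"]
    return dict(profiles)

profiles = browsing_profiles(events)
for cookie_id, profile in profiles.items():
    print(cookie_id, sorted(profile["sites"]), profile["total_seconds"])
```

No single event says much. The profile emerges because the same cookie ID shows up on unrelated sites.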

App data sales are less visible but equally significant. Free apps often monetize by selling user data to brokers. Location data, contact lists, usage patterns, and in-app behavior all get packaged and sold. You agreed to this in the terms of service you didn't read. The app developer gets paid. The broker gets data. You get the app for free.

Beyond public records, data brokers rarely collect this information themselves. They buy it from intermediaries who specialize in specific data types. The broker's job is aggregation and linkage.

The matching process

Raw data is useless without identity resolution. Data brokers need to connect the person who bought dog food at Petco to the person who searched for "best dry dog food" and the person who lives at 742 Evergreen Terrace. This is the technical problem the industry has spent billions solving.

Deterministic matching uses exact identifiers. Email addresses, phone numbers, physical addresses, and device IDs serve as anchors. When two data sources contain the same email address, the broker merges those records with high confidence. This is straightforward when the identifiers match perfectly.
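As a rough illustration of deterministic matching, the sketch below merges records from two sources whenever they share the same normalized email address. The record fields, the sample data, and the helper names are assumptions made up for this example, not any broker's actual schema.

```python
from collections import defaultdict

def normalize_email(email: str) -> str:
    """Lowercase and trim so trivially different spellings compare equal."""
    return email.strip().lower()

def merge_by_email(*sources):
    """Combine records from every source under their shared email address."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            key = normalize_email(record["email"])
            # Add fields we have not seen yet; keep the first value otherwise.
            for field, value in record.items():
                profiles[key].setdefault(field, value)
            # Store the normalized form as the canonical identifier.
            profiles[key]["email"] = key
    return dict(profiles)

# Hypothetical records from a retailer and a web tracker.
retail = [{"email": "Pat@Example.com", "last_purchase": "dog food"}]
web = [{"email": "pat@example.com ", "last_search": "best dry dog food"}]

merged = merge_by_email(retail, web)
print(merged["pat@example.com"])
```

The two records look different on the surface, but once the email is normalized they collapse into a single profile holding both the purchase and the search.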

Probabilistic matching handles cases where identifiers don't align exactly. Algorithms compare patterns across data points. A record with the same first name, last name, ZIP code, and approximate age likely represents the same person even if the street address differs slightly. The broker assigns a confidence score and merges the records if the score exceeds a threshold.
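A simplified way to picture the confidence score is a weighted sum of field-by-field similarities. The sketch below is one such scoring scheme, not the method any particular broker uses. The weights, the 0.80 threshold, and the sample records are assumptions chosen for illustration. Production systems typically use more sophisticated statistical models.

```python
from difflib import SequenceMatcher

# Hypothetical weights: how much each agreeing field counts toward a match.
WEIGHTS = {"first_name": 0.25, "last_name": 0.30, "zip": 0.20, "age": 0.10, "street": 0.15}
MATCH_THRESHOLD = 0.80  # assumed cutoff; a real system would tune this

def field_similarity(field, a, b):
    """Return a similarity between 0.0 and 1.0 for one field."""
    if field == "age":
        # Treat ages within two years of each other as agreeing.
        return 1.0 if abs(a - b) <= 2 else 0.0
    if field == "street":
        # Fuzzy comparison tolerates small differences in the address.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return 1.0 if str(a).lower() == str(b).lower() else 0.0

def match_score(rec_a, rec_b):
    """Weighted sum of per-field similarities, between 0.0 and 1.0."""
    return sum(w * field_similarity(f, rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())

a = {"first_name": "Pat", "last_name": "Rivera", "zip": "97477", "age": 41,
     "street": "742 Evergreen Terrace"}
b = {"first_name": "Pat", "last_name": "Rivera", "zip": "97477", "age": 42,
     "street": "742 Evergreen Ter"}

score = match_score(a, b)
print(f"score={score:.2f}, merge={score >= MATCH_THRESHOLD}")
```

Where the threshold sits is the trade-off. Set it lower and more records for the same person get merged, but so do more records for different people who share a common name.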

Device graphs track you across phones, laptops, tablets, and smart TVs. When you log into the same account on multiple devices, the broker links those devices to your identity. When devices share the same WiFi network or IP address, the broker infers household relationships. Over time, the graph maps which devices belong to which people and which people live together.
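One way to think about a device graph is as a set of connected components. Link devices to accounts through logins and you get people. Add links through shared networks and you get households. The sketch below assumes made-up device names, account labels, and documentation-range IP addresses. The connected_groups helper is a generic graph traversal written for this example, not a description of any vendor's system.

```python
from collections import defaultdict

def connected_groups(edges):
    """Return the sets of nodes joined by the given edges (connected components)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical observations: (device, account) logins and (device, IP) sightings.
logins = [("phone-1", "acct:pat"), ("laptop-1", "acct:pat"), ("tablet-2", "acct:sam")]
networks = [("phone-1", "ip:203.0.113.7"), ("tablet-2", "ip:203.0.113.7"),
            ("tv-9", "ip:198.51.100.4")]

people = connected_groups(logins)                 # devices tied to one account
households = connected_groups(logins + networks)  # people tied by a shared network

print(len(people), "people;", len(households), "households")
```

In this toy data, the phone and the tablet belong to different accounts, but sharing one IP address pulls both people into the same inferred household. The same logic is why a shared office or coffee-shop network can produce false links.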

The matching process isn't perfect. Errors happen. Common names create false positives. Address changes create gaps. But the scale compensates for individual errors. With enough data points, the profile converges toward accuracy.

What the profiles contain

A complete data broker profile isn't a single file. It's a collection of attributes stored across multiple databases, linked by identifiers. Different brokers specialize in different data types, but the core categories overlap.

Demographic data includes age, gender, ethnicity, education level, occupation, income range, marital status, and household composition. Some of this comes from public records. Some comes from statistical inference based on ZIP code and purchase behavior.

Contact information covers current and historical addresses, phone numbers, email addresses, and social media handles. Data brokers track your address history going back years or decades. Moving doesn't erase the old records.

Financial indicators estimate your income, net worth, credit behavior, and spending capacity. Brokers don't see your actual credit score or bank balance, but they infer financial status from property values, purchase patterns, and neighborhood demographics.

Purchase history logs what you buy, where you buy it, and how often. Categories include groceries, clothing, electronics, travel, entertainment, and subscriptions. The data is aggregated, not itemized, but detailed enough to reveal preferences and habits.

Online behavior tracks the websites you visit, the content you consume, the searches you perform, and the ads you click. This data connects to advertising profiles used for targeted marketing.

Health and lifestyle interests come from purchases, searches, and app usage. If you buy diabetes test strips, search for "low-carb recipes," and use a fitness tracker, the broker infers health conditions and wellness interests. This isn't medical data in the legal sense, but it reveals sensitive information.

Political and social attributes include voter registration, donation history, issue interests, and predicted political leanings. Data brokers build profiles of likely voters, donors, and activists for campaign targeting.

The profiles aren't static. Brokers update them continuously as new data arrives. Your profile today differs from your profile six months ago.

Who buys the data

With the exception of people-search sites, data brokers mostly don't sell to individuals. They sell to organizations with specific use cases. The buyers fall into categories defined by how they use the data.

Advertisers are the largest customer segment. They buy audience segments for targeted ads. A car manufacturer buys access to people aged 35-50 with household incomes over $75,000 who recently searched for SUVs. The advertiser never sees your name or address. They see a pseudonymous audience ID that allows them to show you ads across websites and apps.
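The SUV example can be sketched as a filter over profiles that hands back only derived IDs. Everything here is hypothetical: the profile fields, the placeholder salt, and the choice of a truncated SHA-256 hash as the audience ID are assumptions for illustration. Actual ad-tech identifiers vary by platform.

```python
import hashlib

def audience_id(email: str, salt: str = "example-salt") -> str:
    """Derive a pseudonymous ID from an email; the salt is a placeholder."""
    digest = hashlib.sha256((salt + email.strip().lower()).encode("utf-8"))
    return digest.hexdigest()[:16]

def build_segment(profiles):
    """Select profiles matching the SUV example and return only their IDs."""
    return [
        audience_id(p["email"])
        for p in profiles
        if 35 <= p["age"] <= 50
        and p["household_income"] > 75_000
        and "suv" in p["recent_searches"]
    ]

# Hypothetical profiles held by the broker.
profiles = [
    {"email": "pat@example.com", "age": 41, "household_income": 92_000,
     "recent_searches": {"suv", "car seats"}},
    {"email": "sam@example.com", "age": 29, "household_income": 88_000,
     "recent_searches": {"suv"}},
]

segment = build_segment(profiles)
print(segment)
```

The buyer receives IDs rather than names, but the same email always produces the same ID. Anyone who holds the email and knows the scheme can recompute it, which is why such identifiers are better described as pseudonymous than anonymous.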

Marketers buy data for direct mail, email campaigns, and telemarketing. They want contact information attached to demographic and behavioral attributes. A credit card company buys a list of people with good credit who recently moved. A retailer buys email addresses of people who shop at competitors.

Employers use data brokers for background checks and candidate screening. The data supplements traditional background checks with social media activity, online behavior, and inferred attributes. Some employers use this for hiring decisions. Some use it for monitoring current employees.

Insurers buy data to assess risk and set premiums. Health insurers want to know if you buy cigarettes or visit fitness websites. Auto insurers want to know if you speed or drive long distances. The data doesn't replace underwriting, but it informs pricing models.

Financial institutions use broker data for credit decisions, fraud detection, and marketing. Banks buy data to pre-screen credit card offers. Lenders use it to verify income and employment. Fraud teams use it to flag suspicious applications.

Political campaigns buy voter data for targeting ads, fundraising appeals, and get-out-the-vote efforts. They want to know who votes, who donates, and what issues matter to specific demographics.

Government agencies buy data for investigations, surveillance, and enforcement. Law enforcement agencies purchase location data, social network graphs, and behavioral profiles. Immigration enforcement buys data to locate individuals. The EPIC documentation shows how government use of commercial data has expanded without clear legal boundaries.

The buyers don't get your full profile. They get slices relevant to their use case. But the slices add up.

The legal landscape

Data brokering operates in a regulatory gap. No federal law prohibits the collection and sale of personal information by third parties. The industry grew faster than the legal framework designed to constrain it.

The Fair Credit Reporting Act regulates consumer reporting agencies, but most data brokers don't meet the legal definition. If they're not providing reports for credit, employment, insurance, or housing decisions, they're not covered. Advertising and marketing uses fall outside the law's scope.

State privacy laws have started to fill the gap. California's CCPA and CPRA give residents the right to know what data brokers hold, request deletion, and opt out of sales. Vermont requires data brokers to register with the state. Colorado, Virginia, and Connecticut have similar laws. But enforcement is inconsistent, and most states have no data broker regulations at all.

Sector-specific laws cover narrow slices. HIPAA protects medical records held by healthcare providers and insurers, but not health-related data inferred from purchases and searches. COPPA protects children under 13, but not teenagers. FERPA protects education records, but not data about students collected outside school systems.

The FTC has enforcement authority over unfair and deceptive practices, but that requires proving harm or deception. Simply collecting and selling data isn't enough. The FTC has brought cases against brokers who misrepresent their practices or fail to secure sensitive data, but those cases are exceptions.

Most data brokering is legal because no law prohibits it.

The scale problem

Understanding data brokers means understanding scale. Individual collection events feel trivial. Aggregation across millions of people and billions of data points creates something different.

Acxiom claims to maintain profiles on around 700 million people globally, with hundreds of data points per profile. Epsilon processes billions of transactions annually. Oracle Data Cloud tracked online behavior across millions of websites before Oracle shut down its advertising business in 2024. These aren't small operations. They're infrastructure.

The Verizon Data Breach Investigations Report documents how breaches at data brokers expose millions of records at once. When a broker gets breached, the attackers don't just get your email address. They get your profile, your household composition, your purchase history, and your inferred attributes. The 2017 Equifax breach exposed data on 147 million people. Equifax is a credit bureau, but the mechanism is the same.

Scale also means persistence. Data doesn't disappear when you delete an account or close a browser. It persists in broker databases, gets resold to other brokers, and reappears in new contexts. Opting out of one broker doesn't remove you from the ecosystem. It removes you from that broker's active marketing lists until the next data refresh.

What you can actually control

You can't stop data brokers from existing. You can reduce your exposure and make their jobs harder.

Opt out of individual brokers by visiting their websites and submitting removal requests. The FTC maintains guidance on how to find and contact major brokers. This is labor-intensive. You'll need to contact dozens of brokers separately. Each has its own process. Some require identity verification. Some charge fees. Most will re-add your data within months as they refresh their databases.

Use automated removal services like Incogni to handle the opt-out process continuously. These services submit removal requests on your behalf and monitor for reappearance. They cost around $10-15 per month. They don't remove you from every broker, but they cover the largest ones and handle the ongoing maintenance.

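Freeze your credit at Equifax, Experian, and TransUnion so that new lenders can't pull your credit report. This limits what an identity thief can do with data leaked or bought from a broker. Credit freezes are free and stay in place until you lift them. A freeze does not stop marketing use of your data, though. To cut prescreened credit and insurance offers, opt out separately at OptOutPrescreen.com. The FTC explains both processes in detail.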

Limit data creation by reducing what you share. Use privacy-focused browsers, block trackers, avoid loyalty programs that sell data, pay cash when practical, and read app permissions before installing. This doesn't stop brokers from accessing public records or purchase data, but it reduces the online behavior component.

Use state privacy rights if you live in California, Colorado, Virginia, Connecticut, or another state with data broker laws. Submit requests to know what data brokers hold and request deletion. The process is slow and incomplete, but it's legally enforceable.

Understand the limits of what control means. You can't erase your public records. You can't prevent retailers from selling aggregated purchase data. You can't stop websites from embedding tracking pixels. You can reduce your profile's completeness and make it harder to link data across sources, but you can't disappear from the ecosystem entirely.

The Office connection

In The Office, Dwight Schrute maintains detailed files on every employee at Dunder Mifflin. He knows their birthdays, their allergies, their family structures, their weaknesses. He updates the files constantly. When someone asks how he knows something obscure about them, he taps his filing cabinet and says "I know things."

Data brokers are Dwight's filing cabinet at industrial scale. They don't need to interact with you to know about you. They compile fragments from dozens of sources, connect them through matching algorithms, and sell access to anyone who asks. The files update automatically. The surveillance is passive. The knowledge accumulates whether you notice or not.

The difference is that Dwight's files stay in Scranton. Data broker profiles circulate through a global marketplace where hundreds of companies buy, sell, and trade access to information about you. And unlike Dwight, they're not doing it to win Employee of the Month. They're doing it because it's profitable.

Why this matters now

Data brokers have existed for decades. What changed is the volume of data they collect, the precision of their matching, and the breadth of their customer base. Twenty years ago, data brokers sold mailing lists. Today they sell behavioral profiles that predict your health conditions, political views, and financial vulnerabilities.

The machinery runs continuously. Every purchase, every search, every app install feeds the system. The profiles grow more detailed. The matching gets more accurate. The uses expand into areas that feel intrusive even when they're legal.

You can't opt out of the modern economy. You can understand how the machinery works, reduce what you feed into it, and use the limited control mechanisms available. That's not perfect. It's what's possible.

[Image: flowchart showing paths from data collection through broker networks to consumer control mechanisms]
Filed under: data brokers, privacy, personal data, identity data, data collection, consumer rights

Frequently asked questions

What is a data broker?
A data broker is a company that collects, aggregates, and sells personal information about you without direct interaction. They compile data from public records, purchase histories, online behavior, and other sources, then package it for sale to advertisers, employers, insurers, and others.

How do data brokers get my information?
Data brokers pull from public records like property deeds and court filings, purchase data from retailers and credit card companies, track your online behavior through cookies and pixels, and buy data from apps that sell user information. They combine these sources to build detailed profiles.

What information do data brokers hold about me?
Data brokers typically hold your name, address history, phone numbers, email addresses, age, income estimates, education level, occupation, purchase history, web browsing patterns, health interests, political leanings, and relationship status. Some profiles contain hundreds of data points.

Is data brokering legal?
Yes, in most of the United States. Federal law doesn't prohibit data brokering. Some states like California, Vermont, and Colorado have passed laws requiring brokers to register or allow opt-outs, but the industry operates largely unregulated.

Can I remove my data from data brokers?
You can request removal from individual brokers, but it's labor-intensive and temporary. Most brokers offer opt-out processes, but you must contact each one separately, and data often reappears as brokers refresh their databases from new sources. Automated removal services exist but require ongoing subscriptions.
