Your Photos Are Training AI — Often Without Your Consent

In April 2026, the FTC concluded an enforcement action against OkCupid, a dating app that had shared nearly three million user photos with AI researchers without any formal agreement, without restrictions on use, and without notifying users or giving them an opt-out. Both the photos and the AI model built from them were eventually deleted, but only after years had passed.

This story received moderate coverage, then faded. It shouldn’t have. It’s a precise illustration of how AI training on personal data actually works: quietly, at scale, through contractual ambiguity, long before enforcement catches up.

If your photos live on a platform that has ever expressed interest in AI, you should understand what that interest might mean for your personal archive.

The Scale of the Problem

The dating app case is notable not because it’s unusual, but because it was documented. For every case that reaches FTC enforcement, there are many more that proceed without scrutiny — because the practice is technically permitted by broad terms of service, because users don’t know to look for it, and because the harms are diffuse and invisible.

Consider what “using your photos to train AI” actually means at a systemic level.

A large photo repository, like the kind that lives on a social media platform or in a cloud storage service, is among the most valuable training datasets in existence. It contains images across an enormous range of subjects, lighting conditions, scenes, and contexts, all helpfully labelled with location data, timestamps, and social graphs. For computer vision researchers, this is extraordinary material.

The companies holding this data face a straightforward economic calculation: the cost of using it is low (it’s already on their servers), the benefit is high (better models = better products = higher valuations), and the legal risk is — as the OkCupid case demonstrates — often manageable or far in the future.

Your photos are not primarily a liability for these companies. They are an asset.

How Terms of Service Enable AI Training

Most cloud storage and social media terms of service include a broad licence that grants the platform the right to use your content for purposes including “improving services,” “developing new technologies,” and “research and development.”

These phrases are doing a lot of work.

“Improving services” can mean training recommendation systems on your photo library. “Developing new technologies” can mean building computer vision models using your face data. “Research and development” can mean licensing your images to affiliated partners for model training.

None of these uses are spelled out explicitly in the language that matters to most users — the heading, the summary, the consent checkbox. They live in subordinate clauses in documents designed not to be read.

This is not an accident. Platform legal teams are sophisticated and intentional. The ambiguity is a feature.
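If you want to check a terms document yourself, a rough first pass is to search it for the broad phrases quoted above and read the sentences they sit in. The sketch below is a minimal illustration under stated assumptions: the phrase list comes from this article, while the function name and the sample text are hypothetical and not drawn from any real platform’s terms. A match is a prompt to read more closely, not proof of AI training.

```python
import re

# Phrases quoted earlier in this article; extend the list for your own review.
BROAD_PHRASES = [
    "improving services",
    "developing new technologies",
    "research and development",
]


def find_broad_phrases(terms_text: str) -> dict[str, list[str]]:
    """Return each broad phrase found, mapped to the sentences containing it."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", terms_text)
    hits: dict[str, list[str]] = {}
    for phrase in BROAD_PHRASES:
        matching = [s.strip() for s in sentences if phrase in s.lower()]
        if matching:
            hits[phrase] = matching
    return hits


# Hypothetical sample text, written for this example only.
sample = (
    "You grant us a licence to use your content for improving services. "
    "We may share content with partners for Research and Development. "
    "You can delete your account at any time."
)

for phrase, sentences in find_broad_phrases(sample).items():
    print(f"{phrase!r} appears in: {sentences}")
```

The matching is deliberately simple (case-insensitive substring search), so it will miss paraphrases such as “enhancing our products”. Treat it as a way to find where to start reading.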

The Faces Problem

Of all the data your photos contain, face data is the most sensitive category.

Modern face recognition doesn’t just identify who you are in a given photo — it builds a biometric model of your face geometry that can then be used to identify you in contexts you’ve never consented to: surveillance footage, other platforms, physical spaces with camera infrastructure.

Every time your face appears in a photo stored on a platform that trains its own models, you’re potentially contributing to the biometric training data of a system whose future uses you cannot predict or consent to.

This isn’t hypothetical. Clearview AI famously scraped billions of public photos to build a face recognition database used by law enforcement agencies in multiple countries. The photos were technically public. The consent was never obtained. In most jurisdictions, the legal frameworks that might have blocked this use either hadn’t been written yet or hadn’t been tested.

Clearview was an extreme case. The general pattern — face data collected for one purpose, used for another — is routine.

What “Opt Out” Usually Means

Many platforms have introduced AI training opt-outs following regulatory pressure and public criticism. These are better than nothing, but they are often not what they appear to be.

Common limitations on AI training opt-outs:

They’re not retroactive. Opting out today does not affect how your photos have already been used. If a model was trained on your data before you opted out, that model doesn’t change. Your face is already in the training set.

They apply to specific uses, not all uses. A platform’s opt-out for “generative AI training” may not cover computer vision training, recommendation model training, or training conducted by affiliated entities under separate data-sharing agreements.

They require users to find them. Opt-out settings are rarely prominent. They typically live deep in privacy dashboards, described in technical language, with defaults favouring the platform. The people least likely to find them are the people who most need to.

They depend on the platform’s integrity. An opt-out is a promise, not a technical constraint. You’re trusting the platform to honour it — a level of trust that becomes harder to justify after cases like OkCupid’s.
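To make the first two limitations concrete, here is a minimal sketch of how a per-purpose opt-out check might behave. Everything in it is an assumption made for illustration: the purpose labels, the `User` class, and the `may_train` function are hypothetical and do not describe any real platform’s system.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class User:
    # Maps a purpose label to the time the user opted out of it.
    # A purpose with no entry is still opted in, which is the usual default.
    opt_outs: dict[str, datetime] = field(default_factory=dict)

    def opt_out(self, purpose: str, when: datetime) -> None:
        self.opt_outs[purpose] = when


def may_train(user: User, purpose: str, run_time: datetime) -> bool:
    """True if a training run for `purpose` at `run_time` may use this user's photos."""
    opted_out_at = user.opt_outs.get(purpose)
    # Allowed if the user never opted out of this purpose,
    # or if the run happened before the opt-out took effect.
    return opted_out_at is None or run_time < opted_out_at


user = User()
user.opt_out("generative_ai", datetime(2025, 6, 1))  # hypothetical purpose label

print(may_train(user, "generative_ai", datetime(2025, 1, 1)))    # True: run predates the opt-out
print(may_train(user, "generative_ai", datetime(2025, 7, 1)))    # False: covered by the opt-out
print(may_train(user, "computer_vision", datetime(2025, 7, 1)))  # True: different purpose, not covered
```

The check itself is just a line of policy code that the platform writes, runs, and can change. Nothing in it technically prevents the data from being read, which is the sense in which an opt-out is a promise rather than a constraint.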

AI Training vs. AI Features

It’s worth drawing a distinction that the industry sometimes blurs: AI training and AI features are different things.

AI features — search, auto-tagging, smart albums, transcription — run on your data to produce results that benefit you. You ask the system to find a photo; it uses AI to find it. The processing is in your service.

AI training — using your photos to improve a model that then serves all users, or gets licensed to third parties, or becomes part of the company’s broader AI infrastructure — is a different transaction. Your data is the input; someone else’s capability is the output. You pay with your privacy; the platform (and potentially its partners) captures the value.

A platform that uses your photos for AI features is doing something you’ve probably consented to in a meaningful sense. A platform that uses your photos to train models for its own or third-party benefit is doing something categorically different — and should require explicit, informed consent, not a buried clause.

The Supply Chain Problem

The OkCupid case also illustrates a supply-chain dimension that’s easy to overlook.

The photos weren’t used by OkCupid itself for its own model training. They were passed to academic and commercial researchers with no formal restrictions on what could be done with them. This is how personal data — your photos — ends up in places you’d never predict: research papers, open-source datasets, model checkpoints shared across the AI community.

Once your photos are in a training pipeline, they don’t stay in one place. They propagate. A model trained on your images can be fine-tuned by other researchers, forked into derivative models, incorporated into commercial products. The original photo may never appear anywhere recognisable, but its contribution to the model’s capabilities persists indefinitely.

This is why “you can delete your account” is an incomplete privacy protection: your data may have already entered pipelines that are functionally irreversible.

What Actual Protection Looks Like

Given all of this, what does a storage provider need to commit to — credibly — for your photos not to become training data?

A clear, categorical prohibition on third-party AI training. Not “we don’t sell your data” (which can coexist with sharing it through partnerships). Not “we take privacy seriously.” A specific statement that your content is never used to train AI models for third parties, full stop.

A business model that doesn’t depend on extracting value from your data. If a platform’s revenue comes from knowing more about you — through better ad targeting, through licensing insights, through selling model capabilities built from your content — it has a structural incentive to use your data even if it occasionally gets caught. A subscription model removes that incentive.

Encryption as a floor. Server-side encryption at rest doesn’t prevent a determined provider from using your data, because the provider controls the decryption keys, but it limits casual access and signals a more careful operational culture.

No ambiguity in the terms. If the terms of service require a lawyer to interpret, assume they’re written to maximise the platform’s flexibility at your expense.

How daftei Approaches This

daftei does not train third-party AI models on your content. This is a categorical commitment: your photos, voice notes, and documents are never used to train AI systems that benefit anyone other than you.

The AI features inside daftei — search, organisation, memory assistance — operate on your data to produce results for you. That processing happens within a system designed around your interests, not an advertiser’s or a researcher’s.

Your content is encrypted at rest with AES-256 and in transit with TLS 1.3. daftei is GDPR and CCPA compliant. We don’t run advertising, we don’t sell data, and our business model gives us no reason to feed your content into AI training pipelines.

This is not a marketing position. It’s an operational constraint. A company that doesn’t sell data and doesn’t run AI training pipelines for third parties has nothing to gain from the kind of data sharing that got OkCupid into trouble, so the temptation never arises in the first place.

The Question That Matters

The OkCupid case will likely not be the last enforcement action of its kind. As AI training demand increases and enforcement capacity slowly catches up, more cases will surface — more photos, more faces, more private moments that ended up in datasets without the knowledge or consent of the people in them.

The right question isn’t whether any given platform will get caught. It’s whether the platform has any incentive to run the risk in the first place.

If the product is free, the revenue comes from somewhere. If the terms of service are broad and ambiguous, that’s a design choice. If there’s no clear, categorical statement about AI training, that gap probably means something.

Your photos are a significant asset. The question is who they’re an asset to.