GitHub Will Train Copilot on Your Code: Opt Out by April 24, 2026

Your code is about to become training data for Copilot. Did you know?
Matteo · 6 min read

You know when you accept the Terms of Service without reading them? Yes, we all do it. But this time it is worth making an exception, because GitHub has just changed the rules of the game — and if you do nothing before April 24, your code becomes training material for Copilot.

Not your entire repository in one go, no. Something more subtle.

What Is Happening, in Plain English

Starting April 24, 2026, GitHub will begin collecting data generated during interactions with Copilot to train its AI models. If you are a Free, Pro, or Pro+ user and you do nothing, you are in. Automatically. Without anyone asking for your permission.

We are not talking about a massive download of your private repositories. We are talking about something more nuanced: every time you use Copilot, GitHub will collect the inputs you send, the outputs you accept or edit, the code context around your cursor, comments, file names, and the structure of your repository.

In practice, every Copilot work session becomes a lesson for the model. And you are the unpaid teacher.

Business and Enterprise users are excluded. Of course they are. Because the people who pay more get the privilege of not being milked.

The Opt-Out Problem

And here comes the fun part. Or the ugly part, depending on your perspective.

The mechanism GitHub chose is opt-out. They do not ask, “Do you want to participate?” They say, “You are already participating, and if you want out, good luck.”

It is like your local supermarket sending you an email saying: “Starting tomorrow, we will use your shopping habits for targeted advertising. If you are not happy about it, go to the end of aisle seven, turn left, find the hidden door behind the detergent shelf, and fill out the form. You have 30 days.”

In Europe, this approach has a technical name: GDPR violation. European regulation requires explicit and informed consent. Opt-out, by definition, is not consent. It is inertia. It is betting that people will not read the email.

To opt out, go to github.com/settings/copilot/features and disable the option that allows GitHub to use your data for training. It takes 30 seconds. But how many of GitHub’s 100 million developers will actually do it?

The Private Repository Paradox

There is one detail that should give chills to anyone working on proprietary code. If even one single collaborator on your project uses Copilot inside your private repository, the code Copilot sees during that session may end up in the training dataset.

Read that again. It does not matter whether you opted out. If your colleague did not, your code is still exposed.

And it is not just about code. Commits include metadata: names, email addresses, timestamps. Code comments may contain references to internal infrastructure, forgotten API keys, or customer names. GitHub says it does not collect “inactive” data, but the definition of what counts as “inactive” is — let us say — flexible.

The Community Reaction Is Not Great

On Hacker News, the discussion exploded. And the dominant sentiment is not disappointment — it is anger.

Some developers are migrating hundreds of repositories to self-hosted alternatives such as Forgejo and Gitea. Others suggest deliberately poisoning the datasets by uploading faulty code. Still others are simply deleting their accounts.

The trust fracture is real. And it is not the first time. Microsoft has a long history of broken promises when it comes to user data. When Microsoft acquired GitHub in 2018, it came with reassuring words about the platform’s independence. Eight years later, here we are.

The Licensing Question

This is the part that hurts the most if you write open source code under restrictive licenses.

If your project is licensed under the GPL, derivative code must keep the same license. But if an AI model is trained on GPL code, is its output "derivative"? The honest legal answer is: nobody knows yet. There is no clear case-law precedent. Meanwhile, GitHub keeps training.

If your project uses an MIT or Apache license, the discussion is different but no less problematic. Those licenses allow code reuse, yes — but they also require attribution. When Copilot suggests a function that is essentially a copy of your code, who gets credited? Nobody.

What You Can Do Right Now

If this concerns you, here are the actions to take before April 24:

1. Disable training on your data. Go to github.com/settings/copilot/features and turn off the training option.

2. Check your collaborators. If you work on private repositories with other developers, make sure they opt out as well. It only takes one weak link.

3. Evaluate alternatives. If this is a matter of principle, consider platforms such as GitLab, Forgejo, Gitea, or self-hosted solutions. Migration is not painless, but it is doable.

4. Revisit your licenses. If your code is open source, consider adding explicit clauses about AI training usage. Some newer licenses, such as the AI2 ImpACT licenses, address this topic directly.
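For step 3, the core of a migration is a `git` mirror: a mirror clone carries every branch, tag, and ref, and a mirror push replicates them to the new host. The sketch below runs entirely on local paths so it is self-contained; in real use, the "old" repository would be your GitHub clone URL and the "new" one the clone URL of a repository you created on your Forgejo, Gitea, or GitLab instance.

```shell
# Fully local sketch of the mirror workflow. The two bare repos stand in
# for the hosted original and the self-hosted destination.
set -e
tmp=$(mktemp -d)

# Stand-in for the original hosted repository, seeded with one commit:
git init -q --bare "$tmp/old.git"
git clone -q "$tmp/old.git" "$tmp/work"
git -C "$tmp/work" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$tmp/work" push -q origin HEAD

# Stand-in for the self-hosted destination (e.g. a Forgejo/Gitea repo):
git init -q --bare "$tmp/new.git"

# The actual migration: mirror-clone the original, repoint it, mirror-push.
git clone -q --mirror "$tmp/old.git" "$tmp/repo-mirror"
git -C "$tmp/repo-mirror" remote set-url origin "$tmp/new.git"
git -C "$tmp/repo-mirror" push -q --mirror

# The destination now holds the exact same history:
git -C "$tmp/new.git" rev-parse HEAD
```

Note that this moves git history only; issues, pull requests, and wikis need the importers that Gitea and Forgejo ship for GitHub migrations.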

The Bigger Picture

The truth is that GitHub’s move is not isolated. It is part of a broader trend in which platforms hosting user-generated content start monetizing that content through AI.

The pattern is always the same: a platform grows thanks to user content, becomes dominant, and then changes the rules to monetize that content. Reddit sold its community's data to Google for 60 million dollars. Stack Overflow struck similar deals. Now it is GitHub's turn.

But there is a crucial difference: source code is not a Reddit comment. It is intellectual property, the product of hours, months, and years of work by millions of people. And the opt-out mechanism — "you are in unless you notice that you need to get out" — is not consent. It is exploited inertia.

If GitHub wants to use interaction data to improve Copilot, it can. But it should ask, not assume. It should be opt-in, not opt-out. And it should remember that without the 100 million developers writing code on its platform every day, Copilot would have nothing to train on.

April 24 is less than a month away. Go into your settings and decide what happens to your code — before someone else decides for you.
