Edit 11/26/2024 at 7:00 a.m. PT: Microsoft, via Twitter (below), has now stated that the company does not use customer data to train its large language models (LLMs).
In M365 applications, we do not use customer data to train LLMs. This setting only enables features that require Internet access, such as co-authoring a document. https://t.co/o9DGn9QnHb (November 25, 2024)
It’s no secret that Microsoft Office delivers connected experiences that analyze user-created content. However, according to @nixCraft, an author at Cyberciti.biz, Microsoft’s Connected Experiences feature automatically collects data from Word and Excel files to train the company’s AI models. The feature is enabled by default, meaning user-generated content is included in AI training unless the feature is manually disabled, and disabling it is a convoluted process. Microsoft has yet to comment on the information, so take it with a grain of salt (EDIT: as noted above, Microsoft has now stated that this setting is not used for AI training).
This default setting would allow Microsoft to use materials such as articles, novels, or other works intended for copyright or commercial use without explicit consent. The implications are significant for creators and businesses that rely on Microsoft Office for their proprietary work, as their data could become part of the company’s AI development. For this reason, anyone concerned about protecting their intellectual property or sensitive information should act immediately.
To do this, users must actively opt out by finding and disabling the feature in Settings. The process requires unchecking the “Enable optional connected experiences” box, which is checked by default.
On a Windows PC, the steps are to go to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and uncheck the box. Seven steps to disable a significant feature that is enabled by default seems needlessly complicated.
Microsoft’s approach reflects a general trend in the tech industry, where other companies have introduced similar features to train their AI models. Although all AI models are trained on human-generated material, using that material without the creators’ consent is unethical, to say the least.
Microsoft has not publicly confirmed or denied using the content of user-generated Excel and Word documents in Microsoft Office to train its AI models. However, a clause in the Microsoft Services Agreement grants the company “a worldwide, royalty-free intellectual property license to use your content.”
“To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant Microsoft a worldwide, royalty-free intellectual property license to use Your Content, for example, to make copies of, store, transmit, reformat, display and distribute via communication tools Your Content on the Services,” the clause states.