Microsoft’s AI research team accidentally exposed 38 terabytes of private data, including secrets, private keys, passwords, and more than 30,000 internal Microsoft Teams messages, while sharing open-source training data on GitHub, according to cloud security company Wiz.
The exposure occurred because the researchers used an Azure feature called Shared Access Signature (SAS) tokens to share their files, but the access level was configured incorrectly. Instead of limiting access to specific files, the link shared the entire storage account, including the additional 38TB of private data, the report said.
Additionally, the token was misconfigured to grant “full control” permissions, meaning that “not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well.”
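To make the misconfiguration concrete, here is a minimal, hypothetical sketch of how a signed access token bakes its resource scope and permission set into the credential itself. This is plain-Python HMAC signing, not Azure’s actual SAS format or APIs; the names and payload layout are invented for illustration. The point it demonstrates is the one Wiz described: whatever scope and permissions are signed into the token are exactly what any holder of the link can do.

```python
import hmac
import hashlib

# Stand-in for the storage account key (illustrative only).
SECRET = b"storage-account-key"

def issue_token(resource: str, permissions: str) -> str:
    """Sign (resource, permissions) so the holder cannot tamper with them."""
    payload = f"{resource}|{permissions}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def authorize(token: str, resource: str, action: str) -> bool:
    """Allow an action only if the token's signed scope and permissions cover it."""
    scope, permissions, sig = token.rsplit("|", 2)
    expected = hmac.new(SECRET, f"{scope}|{permissions}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # forged or altered token
    return resource.startswith(scope) and action in permissions

# Narrow token: read-only ("r") on one folder -- the intended configuration.
narrow = issue_token("account/models/", "r")
# Broad token: full control ("rwd") on the whole account -- what was shared.
broad = issue_token("account/", "rwd")

assert authorize(narrow, "account/models/weights.bin", "r")      # intended use works
assert not authorize(narrow, "account/private/teams.json", "r")  # out of scope: denied
assert authorize(broad, "account/private/teams.json", "d")       # broad token can delete anything
```

Once a link like the broad token is published, revoking it (as Microsoft did) is the only remedy, because the signature itself is the credential.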
Wiz reported its findings to Microsoft on June 22, leading Microsoft to revoke the SAS token on June 24.
Microsoft completed its investigation and said that no customer data or other Microsoft services were put at risk by the issue, and that customers did not need to take any additional action.
“No customer data was exposed, and no other internal services were put at risk because of this issue. No customer action is required in response to this issue,” it said in a statement.
The tech giant explained that the problem stemmed from a Microsoft researcher inadvertently including the SAS token in a public GitHub repository while contributing to open-source AI learning models. Microsoft clarified that there was no security issue or vulnerability within Azure Storage or the SAS token feature.
To prevent such incidents, Microsoft said it encourages users to create and handle SAS tokens appropriately and to follow best practices. It is also actively improving its detection and scanning tools to identify over-provisioned SAS URLs and enhance its secure-by-default posture.