Data-sharing service Dropbox has faced a firestorm of controversy since Northwestern University researchers released a study last week analyzing the data of 40,000 of the company’s users enrolled at or employed by 1,000 universities.

Designed to analyze how successful science teams collaborate, the study has become an object lesson in how users lose control of their data once they hand it over to cloud-based services.

Both the researchers and Dropbox have since rephrased or walked back several claims about how the data was used. The article initially stated that Dropbox gave the researchers access to folder-related data collected over two years, which the researchers anonymized. That was soon changed to something more specific: “access to project-folder-related data, which Dropbox had aggregated and anonymized… This included information on a user’s total number of shared folders, folder structure, and shared folder access, but we and Dropbox employees could view no personally identifiable information.”

Dropbox moved to correct the record in an email to ZDNet stating that “[t]he article contained factual errors,” and that Dropbox had rendered “any identifying information unreadable, including individual emails and shared folder IDs.”

The company also claims the data sharing is covered by its privacy policy, pointing to language such as “How you use the Services, including actions you take in your account (like sharing, editing, viewing, and moving files or folders).”

The privacy policy also states that the company uses third parties to help “provide, improve, protect, and promote our Services.”

Whether the data was anonymized before or after Dropbox shared it, the data itself apparently still contained users’ folder structures, institutional affiliations and researchers’ seniority, any of which could potentially be used to identify users. That possibility has caused widespread consternation among academics and privacy advocates alike.

The initial study has since been revised to “reflect that 1,000 university departments were represented, not 1,000 universities,” and that the final dataset included 16,000 research projects instead of the initially reported 40,000.
