Big data is big business. But in an age of digital privacy paranoia, it isn’t always easy for tech companies to get their hands on information – particularly when some of the most potentially beneficial data is also confidential, locked up in healthcare and finance companies that aren’t comfortable sharing it.

In healthcare, federal laws like the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health (HITECH) Act require companies to protect their patients’ information. The Securities and Exchange Commission’s Privacy of Consumer Financial Information Rule demands the same from financial institutions. So how can data-hungry tech giants gain access to this information?

Google, Microsoft, and a couple of Cornell University professors have begun to develop machine-learning systems intended to bypass this problem by tapping directly into personal data while keeping it anonymous, according to MIT Technology Review.

Healthcare companies maintain unique, identifiable information about their patients. Aggregated, this data might help crack the code to treating various pathologies, but the companies cannot freely share it without risking violations of HIPAA and HITECH. Vitaly Shmatikov and Reza Shokri of Cornell University sought a way to combine the potential insights of personal data from a number of different companies within a given industry.

Shmatikov and Shokri’s “privacy-preserving deep learning” lets organizations train deep-learning algorithms on their own data and share only the key parameters. The researchers hope to refine this Google-backed system so that it performs as efficiently as one with access to the entire database at once. Google has tested a similar approach, publishing a paper on what it calls “model averaging”: the tech giant has trained deep-learning algorithms on data held in closed networks, rather than having the information transfer into Google’s cloud.
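The parameter-sharing idea can be sketched in a few lines. In this toy illustration (not code from either paper, and with illustrative names and synthetic data), each “hospital” trains a simple linear model on data it never reveals, and a central server averages only the learned parameters:

```python
import random

def train_locally(data, epochs=500, lr=0.01):
    """One organization's private training step: gradient descent
    on a linear model y = w*x + b, using only local records."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def average_parameters(params):
    """Server-side step: average each parameter across organizations.
    Only (w, b) pairs are shared -- never the raw patient data."""
    ws = [p[0] for p in params]
    bs = [p[1] for p in params]
    return sum(ws) / len(ws), sum(bs) / len(bs)

if __name__ == "__main__":
    random.seed(0)
    # Three hospitals, each holding private noisy samples of the same
    # underlying relationship y = 2x + 1.
    hospitals = [
        [(x, 2 * x + 1 + random.gauss(0, 0.1))
         for x in [random.uniform(0, 1) for _ in range(50)]]
        for _ in range(3)
    ]
    local_params = [train_locally(d) for d in hospitals]
    w, b = average_parameters(local_params)
    print(f"shared model: w={w:.2f}, b={b:.2f}")  # close to w=2, b=1
```

The shared model ends up close to what centralized training would find, even though no organization ever exposed a single record – which is the efficiency gap the Cornell researchers are working to close for deep networks.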

Microsoft has also looked for a way to train deep-learning algorithms on private information. The company’s “CryptoNets” allow organizations to apply neural networks to their encrypted data. Through a process called “homomorphic encryption”, the neural nets deliver encrypted answers that keep the underlying data confidential. So far, the software has been capable of calculating a patient’s risk of pneumonia by analyzing key vital signs.
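To see what “computing on encrypted data” means concretely, here is a toy version of the Paillier cryptosystem, a well-known additively homomorphic scheme in the same family of techniques. This is a sketch for illustration only – it is not Microsoft’s CryptoNets code, and the tiny hard-coded primes would be thousands of bits long in any real deployment:

```python
from math import gcd

P, Q = 61, 53                 # toy primes (never this small in practice)
N = P * Q                     # public modulus
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)  # lambda = lcm(p-1, q-1)
G = N + 1                     # standard generator choice
MU = pow(LAM, -1, N)          # lambda^-1 mod n (decryption constant)

def encrypt(m, r):
    """E(m) = g^m * r^n mod n^2, with randomness r coprime to n."""
    assert 0 <= m < N and gcd(r, N) == 1
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c):
    """m = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N

# The homomorphic property: multiplying two ciphertexts adds the
# plaintexts underneath, so a server can sum values it cannot read.
c1 = encrypt(42, r=17)
c2 = encrypt(100, r=23)
c_sum = (c1 * c2) % N_SQ
print(decrypt(c_sum))  # -> 142
```

A service holding only `c1` and `c2` can produce an encrypted result without ever learning that the inputs were 42 and 100; only the keyholder can decrypt the answer. Neural-network inference, as in CryptoNets, layers many such encrypted operations on top of this basic property.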

Privacy and security are top concerns in our digital age. Add artificial intelligence to the mix, and people start to get uneasy. If researchers can devise new ways to keep data encrypted and anonymous, companies like Microsoft and Google may be able to unleash machine learning on a wealth of sensitive data.

Image credit: Creative Commons