Risks of Noncompliance and Challenges around Privacy-Preserving Techniques
Despite the promising applications of AI in healthcare that we explored in Part 1 of this series, ethical concerns persist regarding the potential misuse of these innovations and the safeguarding of health data. For instance, a drug discovery AI system repurposed to search for harmful compounds generated 40,000 potentially lethal molecules, including some of the most potent known nerve agents, in just six hours.
The first part of our blog series on safeguarding health data used for machine learning set the stage by looking at the benefits of AI in healthcare and the different sources of health data. In this second part, we cover the risks of noncompliance with data protection laws and regulations, potential misuses of health data facilitated by AI, and the challenges around privacy-preserving techniques. Part 3 dives into various attacks launched against AI models and the data used for their training.
Risks Associated with Noncompliance
For AI to be accurate and helpful, the current wisdom is that vast amounts of data are required to train a model on relevant examples from which it can learn. Data protection laws such as the General Data Protection Regulation (GDPR) in the EU and the Health Insurance Portability and Accountability Act (HIPAA) in the US, to name two prominent examples, erect notable hurdles for AI developers seeking to collect and use such training data insofar as it contains personally identifiable information (PII) or protected health information (PHI).
Noncompliance can lead to significant fines and litigation costs for businesses. For example, Google DeepMind recently had a class action lawsuit dismissed that stemmed from a finding by the UK data protection authority that DeepMind’s partner lacked a legal basis for providing it with patient data to develop an app for detecting acute kidney injury. Notably, the lawsuit was dismissed not because the data protection concern was invalid, but because a requirement for bringing a class action was not fulfilled.
Such procedural hurdles do not exist everywhere, though. The EU’s Collective Redress Directive, which has applied across member states since 2023, facilitates class action lawsuits such as the one brought against Google.
In addition to the costs of noncompliance to businesses, the World Health Organization (WHO) explains in its AI Ethics and Governance Guidance for Large Multi-Modal Models operating in the Health Sector that irresponsible data handling practices could erode trust in the healthcare system, with highly undesirable consequences for the industry as well as for individuals. Many more risks are set out in this Guidance.
Potential Misuses of Health Data Facilitated by AI
There are many risks surrounding the use of AI in healthcare, as the WHO set out comprehensively in its Guidance. We provide a brief list of the key risks here:
- Unequal access to healthcare, insurance, employment, social services, funds, etc. as a result of biases contained in AI systems
- Entrenchment of system-wide biases through broad reliance on AI systems
- Inaccurate health-related diagnoses and treatment advice
- Cybersecurity risks that could undermine trust and broadly impact the healthcare sector by rendering key systems unavailable
- Overestimation of and overreliance on AI, leading to skill degradation
- Privacy concerns
When looking at the risks that arise from the use of AI in healthcare, it is important to assess them against the risks that would exist even in the absence of this new technology. Otherwise, a distorted picture emerges that exaggerates the negative impact of AI.
With regard to privacy, many of the concerns are not new. However, AI may exacerbate existing risks related to safeguarding health data privacy and security. The integration of AI systems into healthcare workflows increases the volume, velocity, and variety of data processed, creating new opportunities for unauthorized access, data breaches, and privacy violations such as disregard of consent and transparency requirements or of data subjects’ access and deletion rights. Without adequate safeguards, such as robust encryption, access controls, and data governance frameworks, AI can amplify the risks of data misuse, potentially compromising patient privacy and confidentiality.
Challenges around Privacy-Preserving Techniques
Various techniques are being developed to combat the privacy risks outlined in the previous section. However, the deployment of privacy-preserving machine learning (PPML) techniques, though crucial, encounters several challenges. This section addresses adaptability, scalability, the trade-off between accuracy and AI ethics (including transparency), data integrity, and a possible trade-off between privacy and utility in the context of AI development.
Adaptability: One Size Doesn’t Fit All
One major hurdle is that privacy-preserving methods are often designed with specific AI algorithms in mind, making them hard to apply universally. As AI technology grows and new algorithms emerge, there’s a continuous need to develop privacy approaches that can keep up. While techniques like local differential privacy have shown promise, their application is limited by this need for customization.
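To make this more concrete, the sketch below implements the classic randomized response mechanism, one of the simplest forms of local differential privacy, in Python. The binary survey-style question, the epsilon value, and the population size are assumptions made purely for illustration; a production system would use a vetted library and carefully chosen parameters.

```python
import math
import random

def randomized_response(true_answer: bool, epsilon: float) -> bool:
    """Report the true answer with probability p = e^eps / (e^eps + 1), otherwise flip it.
    Each individual perturbs their own answer locally, so the collector never sees raw data."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return true_answer if random.random() < p else not true_answer

def estimate_true_rate(reports, epsilon: float) -> float:
    """Debias the aggregated noisy reports to estimate the true proportion."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

# Example: 10,000 patients, of whom roughly 30% truly have a sensitive condition (assumed values).
random.seed(0)
truth = [random.random() < 0.3 for _ in range(10_000)]
reports = [randomized_response(t, epsilon=1.0) for t in truth]
print(f"true rate: {sum(truth) / len(truth):.3f}, "
      f"estimate from noisy reports: {estimate_true_rate(reports, epsilon=1.0):.3f}")
```

Even this tiny mechanism shows the customization problem: it only answers a single yes/no question, and adapting it to richer data types or other model architectures requires a different design.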
Balancing Efficiency and Privacy
As AI models handle larger data sets, the computational demand of privacy techniques can become a bottleneck. Methods like homomorphic encryption, which offer strong data protection, also require significant processing power. Finding scalable solutions that ensure privacy without compromising processing efficiency is a critical challenge in AI applications.
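To give a feel for this cost, the following sketch sums a small list of values once in the clear and once under Paillier encryption, an additively homomorphic scheme, using the open-source python-paillier (phe) library. The library choice, key length, and toy data are assumptions for illustration; actual overheads vary widely with the scheme, parameters, and hardware.

```python
import time
from phe import paillier  # python-paillier: additively homomorphic encryption

values = [float(i) for i in range(100)]  # toy "patient measurements" (assumed)

# Plaintext baseline
t0 = time.perf_counter()
plain_sum = sum(values)
t_plain = time.perf_counter() - t0

# Encrypted computation: encrypt each value, add the ciphertexts, decrypt once at the end
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)
t0 = time.perf_counter()
encrypted = [public_key.encrypt(v) for v in values]
encrypted_sum = encrypted[0]
for c in encrypted[1:]:
    encrypted_sum = encrypted_sum + c   # addition works directly on ciphertexts
result = private_key.decrypt(encrypted_sum)
t_enc = time.perf_counter() - t0

print(f"plaintext sum: {plain_sum} computed in {t_plain:.6f} s")
print(f"encrypted sum: {result} computed in {t_enc:.2f} s")
```

Scaling this from a hundred values to the millions of records typical of health datasets is exactly where the efficiency question becomes pressing.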
The Accuracy-Ethics Trade-off
Ethical considerations in AI, such as fairness and transparency, often conflict with the goal of achieving the highest accuracy.
AI algorithms learn from historical data. If this data contains biases, the AI models will likely replicate or even amplify these biases. Striving for high accuracy without addressing these biases means the models may perform well according to their training data but do so unfairly by discriminating against certain groups. Ethical considerations require actively correcting for these biases, which might reduce the model’s performance on biased historical data but increase fairness. Recent research has shown that de-biased data can also increase accuracy.
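As a rough illustration of what actively correcting for bias can look like, the sketch below builds a deliberately biased synthetic dataset, measures a simple group fairness metric (the demographic parity gap), and applies a reweighing-style pre-processing correction before retraining. Every value in the data, the choice of logistic regression, and the specific reweighing scheme are assumptions made for the example; real bias audits use domain-appropriate metrics and data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 4000

# Synthetic, deliberately biased historical data (all values assumed for illustration).
group = rng.integers(0, 2, n)                      # sensitive attribute (0 or 1)
signal = rng.normal(size=n)                        # legitimately predictive feature
proxy = group + 0.3 * rng.normal(size=n)           # feature correlated with group (e.g. zip code)
y = (signal - 1.2 * group + 0.3 * rng.normal(size=n) > -0.6).astype(int)  # biased labels
X = np.column_stack([signal, proxy])

def demographic_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

baseline = LogisticRegression().fit(X, y)
print("baseline parity gap:", round(demographic_parity_diff(baseline.predict(X), group), 3))

# Reweighing (in the style of Kamiran & Calders): weight each (group, label) cell so that
# group membership and outcome look statistically independent in the training data.
weights = np.empty(n, dtype=float)
for g in (0, 1):
    for label in (0, 1):
        mask = (group == g) & (y == label)
        weights[mask] = ((group == g).mean() * (y == label).mean()) / max(mask.mean(), 1e-12)

reweighed = LogisticRegression().fit(X, y, sample_weight=weights)
print("reweighed parity gap:", round(demographic_parity_diff(reweighed.predict(X), group), 3))
print("accuracy on historical labels:",
      round(baseline.score(X, y), 3), "->", round(reweighed.score(X, y), 3))
```

Comparing the printed parity gaps and accuracies before and after the correction gives a concrete feel for the trade-off described above: performance is measured against biased historical labels, so a fairer model can look "worse" on paper.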
Highly accurate AI models, especially those using complex techniques like deep learning, can be difficult to interpret. While these models might achieve high accuracy, their “black box” nature makes it hard to understand how decisions are made, conflicting with the ethical requirement for transparency. In contrast, simpler models might offer less accuracy but are more transparent and understandable, making it easier to ensure they’re making decisions for the right reasons.
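As a small illustration of the transparent end of this spectrum, the sketch below fits a deliberately shallow decision tree on scikit-learn's built-in Wisconsin breast cancer dataset and prints its decision rules. The dataset, the depth limit, and the resulting accuracy are arbitrary choices for the example, not a recommendation for clinical use.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# A deliberately shallow tree: likely less accurate than a large ensemble or deep network,
# but every prediction can be traced back to a handful of human-readable rules.
data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

print(f"training accuracy: {tree.score(data.data, data.target):.3f}")
print(export_text(tree, feature_names=list(data.feature_names)))
```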
In fields like healthcare, where patient data is sensitive, this trade-off is particularly pronounced. Ensuring AI decisions are both ethically sound and effective requires careful navigation of these ethical dilemmas.
Guarding Against Tampering
The reliability of decisions made by AI systems is fundamentally linked to the integrity of their data. It’s vital to shield data against unauthorized alterations or poisoning to ensure that the information remains true to its original form. This challenge underscores the importance of implementing rigorous protections that prevent any unauthorized changes to data, thereby preserving its accuracy and trustworthiness for decision-making processes.
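One common building block for such protections is a keyed integrity check over the data. The sketch below uses an HMAC-SHA256 tag to detect any modification of a toy dataset between ingestion and training; the dataset contents and the in-code key are placeholders for illustration, since a real deployment would manage keys in a secrets manager and sign data at its source.

```python
import hashlib
import hmac

# Toy dataset bytes and key, purely illustrative.
SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"
dataset = b"patient_id,age,diagnosis\n001,54,I10\n002,61,E11\n"

def sign(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the dataset's raw bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Recompute the tag and compare in constant time before using the data for training."""
    return hmac.compare_digest(sign(data, key), expected_tag)

# At ingestion time: record the tag alongside the dataset's provenance metadata.
tag = sign(dataset, SECRET_KEY)

# Later, before training: any unauthorized modification changes the tag.
tampered = dataset.replace(b"I10", b"C50")
print("original passes check:", verify(dataset, SECRET_KEY, tag))   # True
print("tampered passes check:", verify(tampered, SECRET_KEY, tag))  # False
```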
Privacy vs. Utility
A key challenge in privacy-preserving AI is balancing the need to protect user data with the desire to maintain the usefulness of that data. Developing strategies that minimize data exposure while preserving the value of the information is crucial for the ethical use of AI. For example, studies on human mobility patterns using location data from smartphones or GPS devices require balancing privacy with the granularity of data needed for accurate analysis. Aggregating data to protect individual locations reduces the risk of re-identification but can also smooth out important patterns or anomalies in the mobility data, impacting the study’s outcomes.
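The sketch below illustrates this tension on toy GPS data: snapping points to coarser and coarser grid cells lowers re-identification risk but also collapses the spatial detail an analysis may depend on. The coordinates, cell sizes, and their approximate physical dimensions are assumptions made for the example.

```python
import random
from collections import Counter

# Toy GPS traces (latitude, longitude) clustered around a single city centre (assumed values).
random.seed(0)
points = [(52.52 + random.gauss(0, 0.01), 13.40 + random.gauss(0, 0.01)) for _ in range(1000)]

def aggregate(points, cell_size_deg):
    """Snap each point to a grid cell and count visits per cell.
    Larger cells mean lower re-identification risk but coarser mobility patterns."""
    cells = Counter()
    for lat, lon in points:
        cell = (round(lat / cell_size_deg) * cell_size_deg,
                round(lon / cell_size_deg) * cell_size_deg)
        cells[cell] += 1
    return cells

for cell_size in (0.001, 0.01, 0.1):   # roughly ~100 m, ~1 km, ~10 km cells
    cells = aggregate(points, cell_size)
    print(f"cell size {cell_size:>5} deg -> {len(cells):4d} distinct cells, "
          f"largest cell holds {max(cells.values())} of {len(points)} points")
```

At the finest grid, many cells contain only a handful of individuals and are easy to re-identify; at the coarsest, nearly everything collapses into a few cells and most of the mobility structure is gone.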
Conclusion
As we conclude this part of our series on safeguarding health data in AI, it’s evident that the intersection of AI’s potential with privacy concerns presents complex challenges, from a regulatory compliance perspective and as a result of the technology itself. The exploration into these challenges underlines the importance of transparency, robust ethical practices, and the development of scalable, effective privacy measures. Moving forward, addressing these issues will be critical in nurturing trust and maximizing the benefits of AI in healthcare. Our journey into understanding the full scope of these concerns and the solutions to them continues in Part 3.