Apple Intelligence: The Latest AI Privacy Challenge
By Alastair Paterson, CEO and co-founder of Harmonic Security
At its Worldwide Developers Conference in June, Apple announced its long-anticipated move into AI, partnering with OpenAI to bring ChatGPT to iOS 18, which will begin rolling out to users generally in September. Apple's senior vice president of software engineering, Craig Federighi, calls the strategy “a brand-new standard for privacy in AI”. Others, however, including OpenAI co-founder Elon Musk, were not so sure, with Musk threatening to ban the devices from his offices and branding the move an ‘unacceptable security violation’.
Musk’s outburst may be an overreaction, but it is symptomatic of wider high-profile concerns around AI and data privacy; in recent months alone there has been disquiet over data privacy issues surrounding Microsoft Recall, Slack, and Figma.
How will user data train OpenAI models? We don’t know.
Apple has sought to reassure users that most of the new AI capabilities will run on-device – and this makes business sense, since the onus will be on users to upgrade to the latest devices capable of running them. However, questions remain in other areas. For instance, Apple states that “when requests are routed to Private Cloud Compute, data is not stored or made accessible to Apple, and is only used to fulfil the user’s requests.” Information on how user data is used to train OpenAI models is conspicuous by its absence. Furthermore, different, competing data privacy preferences may apply when users connect their ChatGPT accounts. This should give individuals and corporations pause for thought – for now.
Ambiguous AI privacy policies have become the norm
Such ambiguity is what we have come to expect from GenAI-enabled SaaS companies, and this is far from an isolated incident. In just the last few months, DocuSign updated its FAQ to state that “if you have given contractual consent, we may use your data to train DocuSign’s in-house, proprietary AI models”. Then, in May, a similar AI scare had Slack users up in arms, with many criticizing vague data policies after it emerged that, by default, their data (including messages, content, and files) was being used to train Slack’s global AI models. Reports remarked that ‘it felt like there was no benefit to opting in for anyone but Slack.’ After conducting our own research, we found that such cases are just the tip of the iceberg: even the likes of LinkedIn, X, Pinterest, Grammarly, and Yelp have potentially risky content training declarations.
Pushing data boundaries
At the best of times, navigating data retention and content training declarations is hard work. More often than not, this information is missing outright or buried deep, written in dense legal language, and subject to change over time. Vendors will continue pushing these boundaries as the global arms race to get ahead in AI accelerates. AI models need vast data sets to train them, which means vendors will be increasingly tempted to harvest as much data as they can and answer questions later. This could take the form of feature updates that are, in effect, a ‘data grab’ delivering little value to the end user, or of vague policies deliberately designed to confuse, which businesses unwittingly sign up to.
Previously, it has been possible for organizations to turn off risky features until they are properly understood; however, we may be approaching a situation where these features are so baked into the operating system that doing so becomes difficult. Can we (and should we) trust these companies to secure our data, or do we need to shift how we approach data protection?
Navigating the minefield
Currently, organizations are in a tricky position – useful AI services, correctly implemented, can deliver huge productivity and innovation benefits, yet simply blocking everything is as risky as enabling everything, given that firms can fall behind their peers. So what should firms do?
Firstly, check the T&Cs – start with the applications most commonly used in your organization, but be aware that checking the text can be complicated, as few services are forthcoming about exactly what types of data are used to train their models. The details are often buried and sometimes not referred to at all. Overarching privacy policies can be the best place to guide decision-making, but they can change frequently, so they need to be reviewed regularly – a simple way to keep on top of changes is sketched below. Some of these policies do provide data training opt-outs, especially in the EU, which is another option for security teams.
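To support that regular review, a small script can flag when a vendor’s published policy page has changed since you last checked it. The following is a minimal sketch in Python, assuming hypothetical policy URLs and a simple hash comparison; a changed hash only signals that the page content differs (which may include cosmetic changes), so it is a prompt for human review of the data-training terms rather than a verdict.

import hashlib
import json
import urllib.request
from pathlib import Path

# Hypothetical policy pages to watch; replace with the vendors your organization actually uses.
POLICY_URLS = [
    "https://example-vendor.com/privacy",
    "https://example-saas.com/legal/ai-data-use",
]

STATE_FILE = Path("policy_hashes.json")

def fetch_hash(url: str) -> str:
    # Download the policy page and return a SHA-256 digest of its raw bytes.
    with urllib.request.urlopen(url, timeout=30) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

def main() -> None:
    previous = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    current = {}
    for url in POLICY_URLS:
        try:
            current[url] = fetch_hash(url)
        except OSError as exc:
            print(f"Could not fetch {url}: {exc}")
            continue
        if url in previous and previous[url] != current[url]:
            # Any change is worth a human read of the policy's data-training terms.
            print(f"Policy page changed since last check: {url}")
    STATE_FILE.write_text(json.dumps(current, indent=2))

if __name__ == "__main__":
    main()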
Focus on the apps you need – the challenge of understanding and tracking these T&Cs is exacerbated by the huge surge in shadow AI. With such attention to detail required to oversee apps, it helps to have fewer apps to focus on. This means having honest conversations with employees to understand which services they are using and for what purpose, to promote transparency and an understanding of the risks posed by rogue AI adoption.
Identify shadow AI – on the technical side, traditional cybersecurity tools such as internet gateways and next-generation firewalls can provide data for manually identifying potential shadow AI instances, as in the sketch below. For companies using identity providers like Google, tracking “Sign-in with Google” activity can reveal unauthorized app usage. Specialized third-party solutions designed specifically to detect both shadow IT and shadow AI can significantly improve an organization’s ability to pinpoint and mitigate these risks.
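As an illustration of the log-based approach, the sketch below (in Python, assuming a hypothetical CSV export with “user” and “destination_host” columns and an illustrative, incomplete list of GenAI domains) shows how gateway or firewall logs might be triaged to surface which employees are reaching known AI services. Real log formats and domain lists will differ, so treat this as a starting point rather than a detection product.

import csv
from collections import defaultdict

# Illustrative and deliberately incomplete list of GenAI service domains; maintain your own.
GENAI_DOMAINS = {
    "chat.openai.com",
    "chatgpt.com",
    "claude.ai",
    "gemini.google.com",
    "copilot.microsoft.com",
}

def flag_shadow_ai(log_path: str) -> dict[str, set[str]]:
    # Scan an exported gateway log (assumed columns: user, destination_host)
    # and return the AI service hosts each user has contacted.
    hits: dict[str, set[str]] = defaultdict(set)
    with open(log_path, newline="") as handle:
        for row in csv.DictReader(handle):
            host = row.get("destination_host", "").lower().strip()
            if any(host == d or host.endswith("." + d) for d in GENAI_DOMAINS):
                hits[row.get("user", "unknown")].add(host)
    return hits

if __name__ == "__main__":
    for user, domains in flag_shadow_ai("gateway_export.csv").items():
        print(f"{user}: {', '.join(sorted(domains))}")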
One thing is for sure – AI only gets better the more data it has, and organizations will need to carefully weigh the benefits and drawbacks of the services they use and the information they share with them.