British media report: Google and Microsoft guard against hackers' 'indirect hint injection attack'.-bincial

British media report: Google and Microsoft guard against hackers' 'indirect hint injection attack'.

Posted Time: 2025 November 6 17:09

473

Reference Information

According to a report on the website of the Financial Times on November 2, top artificial intelligence (AI) organizations worldwide are intensifying their efforts to overcome a critical security vulnerability in large language models that could be ex

Google's DeepMind, Anthropic, OpenAI, and Microsoft are among the companies that are trying to prevent so-called indirect prompt injection attacks. In such attacks, third parties hide instructions in websites or emails to trick AI models into reveali

Jacob Klein, head of the threat intelligence team at AI startup Anthropic, said, "Online attackers are using AI to attack every part of the chain."

AI institutions are adopting various methods, including hiring external testers and using AI-driven tools, to detect and reduce the malicious use of their powerful technologies. However, experts warn that the industry has yet to address the issue of

Part of the reason is that large language models are designed to follow instructions, and currently unable to distinguish between legitimate user instructions and inputs that should not be trusted. This is also the reason why AI models are prone to j

Klein said that Anthropic worked with external testers to improve its Claude model's resistance to indirect prompt injection attacks. They also equipped AI tools to detect potential situations where these attacks could occur.

He also said, 'When we detect malicious use behavior, we will automatically trigger certain intervention measures based on credibility, or submit it for manual review.'

Researchers within Google's DeepMind unit will continually attack the company's Gemini AI model in a real-world scenario to identify potential security vulnerabilities.

In May this year, the UK National Cyber Security Centre warned that the threat posed by such vulnerabilities is increasing, and it could expose millions of businesses and individuals using large language models and chatbots to complex phishing attack

Another major flaw in large language models is that external parties can create backdoors and plant malicious content in the data used for AI training, leading to abnormal behavior in the models.

A new study published last month by Anthropic, the UK's Institute for AI Safety, and the Alan Turing Institute revealed that the implementation of these alleged "data poisoning attacks" is less challenging than previously believed by scientists.

Although these flaws pose significant risks, experts believe that AI is also helping companies enhance their ability to ward off cyber attacks.

Ann Johnson, the vice president and deputy chief information security officer at Microsoft, said that for years, attackers have had the advantage because they only need to find one weakness, while defenders must protect in all directions.

She said, 'The defense system is learning and adapting faster, and shifting from passive to active.'

Behind the competition among organizations to overcome the flaws of AI models, cybersecurity has become one of the top concerns for enterprises seeking to apply AI tools to their businesses.

Experts who study cyberattacks say that the development of AI in recent years has fueled a multibillion-dollar cybercrime industry. It has provided cheap malware-writing tools for amateur hackers and helped professional criminals better automate and

Jack Moore, a global cybersecurity advisor at ESET, a cybersecurity company, said, "Large language models enable hackers to quickly generate new malware that has not yet been detected, which increases the difficulty of defense."

MIT researchers have recently found that 80% of the ransomware attacks they investigated used AI.

In 2024, phishing fraud and deepfake fraud related to AI technology increased by 60%.

Hackers also use AI tools to collect information about victims online. Large language models can efficiently search personal data and images on public accounts on the internet, and even find voice clips of individuals.

Network security experts indicate that businesses must remain vigilant in monitoring new threats, and consider limiting the number of personnel who are authorized to access sensitive datasets and vulnerable AI tools. (Edited by Qing Songzhu)}