Anthropic says its latest AI bot can beat Gemini and ChatGPT

Anthropic, the AI company started by several former OpenAI employees, says the new Claude 3 family of AI models performs as well as or better than leading models from Google and OpenAI. Unlike earlier versions, Claude 3 is also multimodal, able to understand text and photo inputs.

Anthropic says Claude 3 will answer more questions, understand longer instructions, and be more accurate. Claude 3 can also take in more context, meaning it can process more information at once. The family comprises three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, with Opus being the largest and “most intelligent model.” Anthropic says Opus and Sonnet are now available on claude.ai and its API. Haiku will be released soon. All three models can be deployed for chatbots, auto-completion, and data extraction tasks.

Previous versions of Claude refused to answer some harmless prompts, which the company writes “suggests a lack of contextual understanding.” The new models are less likely to refuse prompts that come close to the limits of their safety guardrails, similar to what Meta is rumored to be planning for Llama 3 when it’s released.

Bar chart showing a significantly lower rate of “refused on harmless prompt” responses by Claude 3 AI models (near or below 10 percent), compared to Claude 2.1 (around 25 percent).

Incorrect refusals on Claude 3 versus Claude 2.1.

Image: Anthropic

Anthropic claims Claude 3 models can give near-instant results even while parsing dense material like a research paper. A blog post says Haiku, the smallest version of Claude 3, is “the fastest and most cost-effective model on the market,” able to read a dense research paper complete with charts and graphs “in less than three seconds.”

Anthropic says Opus outperformed most models in several benchmarking tests. It showed better graduate-level reasoning than OpenAI’s GPT-4, scoring 50.4 percent on that test against GPT-4’s 35.7 percent. It also scored higher on math, coding, and reasoning tests.

A list of benchmark scores comparing AI models from Anthropic, OpenAI, and Google, showing Claude 3 (Opus) as the highest scoring model on all of the tests listed.

Claude 3 models compared to GPT-4, GPT-3.5, and Gemini 1.0 Ultra / Pro.

Image: Anthropic

The new models also significantly improve on the previous Claude 2.1 model. Sonnet, the midsize model, is twice as fast as Claude 2 and Claude 2.1. “It excels in tasks demanding rapid responses, like knowledge retrieval or sales automation,” Anthropic said.

Anthropic trained the Claude 3 models on a mix of nonpublic internal and third-party datasets and publicly available data as of August 2023. The company says in a paper introducing the three models that they were trained using hardware from Amazon’s AWS and Google Cloud. Both companies have invested in Anthropic, with Amazon putting $4 billion into the company. Claude 3 will be available on AWS’s model library Bedrock and in Google’s Vertex AI.
