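K.Ishi@生成AIの産業応用 on X: "Apple has released a report arguing, in effect, that the mathematical reasoning ability of OpenAI o1 is overrated. The study found that even the latest o1-preview cannot ignore irrelevant information in a problem, and its accuracy drops substantially as a result. This suggests that LLMs may not understand mathematics and may simply be relying on pattern matching. https://t.co/q3Ik1tNY3F" / X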

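Apple has released a report arguing, in effect, that the mathematical reasoning ability of OpenAI o1 is overrated. The study found that even the latest o1-preview cannot ignore irrelevant information in a problem, and its accuracy drops substantially as a result. This suggests that LLMs may not understand mathematics and may simply be relying on pattern matching.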

Quote

elvis

@omarsar0

19h

Has mathematical reasoning in LLMs really advanced? This study tests several SoTA models on a benchmark created with symbolic templates that enable diverse mathematical problems. They find that LLMs exhibit variance when responding to variations of the same questions. The

1:12 PM · Oct 12, 2024

63.9K Views · 171 Bookmarks
