AI benchmarks are broken. Here’s what we need instead.

Technology

Global

Started April 01, 2026

For decades, artificial intelligence has been evaluated through the question of whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks. This framing is seductive: An AI vs. human comparison on isolated problems with clear…

Source article

AI benchmarks are broken. Here’s what we need instead.

MIT Technology Review (United States) | Mar 31, 2026

Consensus analysis requires 7+ participants, 20+ votes, and 3+ votes per statement.

CLAIM posted by will • Apr 01, 2026

Maintaining human-centric benchmarks is essential for ensuring AI systems remain accountable and aligned with human values.

CLAIM posted by will • Apr 01, 2026

Current benchmarks, despite their flaws, provide a familiar framework for understanding AI advancements and should not be discarded entirely.

CLAIM posted by will • Apr 01, 2026

Shifting focus from human comparison to task efficiency could drive innovation and prioritize AI's unique strengths.

CLAIM posted by will • Apr 01, 2026

Redefining AI performance metrics could lead to misinterpretation of capabilities, potentially causing public mistrust in AI technologies.

CLAIM posted by will • Apr 01, 2026

AI benchmarks should evolve beyond human comparisons to better reflect real-world applications and collaborative potential.

💡 How This Works

• Add Statements: Post claims or questions (10-500 characters)
• Vote: Agree, Disagree, or Unsure on each statement
• Respond: Add detailed pro/con responses with evidence
• Consensus: After enough participation, analysis reveals opinion groups and areas of agreement

Society Speaks is open and independent. Your support keeps civic discussion free from advertising and commercial influence.
