Anthropic Surpasses OpenAI as Local AI Models Hit New Heights

Overview

Anthropic has overtaken OpenAI in both valuation and revenue, marking a dramatic shift in the AI industry landscape. Meanwhile, local AI models are reaching unprecedented performance levels, with Qwen3.6-27B achieving near-frontier results on consumer hardware. The community is also grappling with the implications of AI self-preferencing in hiring, dark-money campaigns shaping public perception, and Sam Altman's evolving stance on universal basic income.

Hacker News Stories

AI uses less water than the public thinks

400 points · 376 comments · by hirpslop

A ChatGPT-generated comic contrasting AI water use with agriculture and city water consumption

A UC Davis water scientist debunks exaggerated claims about AI data center water consumption, presenting calculations showing that California's data centers use a fraction of the water consumed by agriculture or municipal systems. The article argues that while data center resource use is a legitimate concern, media coverage has been driven more by speculation than by actual measurements, and that the lack of transparency from AI companies has fueled unwarranted panic.

Interesting Points

California has about 15 million square feet of data center floor space across roughly 340 acres
Major industrial cooling systems operate at 60-90% efficiency, translating to 29-255 meters of evaporated water per cubic meter of floor space at worst-case estimates
The author cross-checked calculations using four AI models, finding they produced similar estimates based on fundamental physics
The article notes that data center water use will be more significant in states with less developed water infrastructure

Top Comment Threads

choppaface (6 replies) -- Asks for arguments about where the SWE job market can grow in a post-Claude world, noting that CEOs might recognize they can accomplish more with empowered engineers rather than replacing them.
ricardobayes (0 replies) -- Argues that computing cost and reliability remain the bottleneck, citing that 95% of gen-AI pilots failed to improve the bottom line. Suggests layoffs were about capex corrections, not AI replacement.

Uber torches 2026 AI budget on Claude Code in four months

397 points · 467 comments · by lwhsiao

Illustration of a burning cup representing budget burn

Uber deployed Claude Code to its engineering team in December 2025, and by April had consumed its entire annual AI budget. Usage doubled by February as developers discovered multi-step capabilities. 95% of Uber engineers now use AI tools monthly, with 70% of committed code originating from AI. Monthly API costs per engineer ranged from $500 to $2,000. The CTO said the company is 'back to the drawing board' on AI budgeting, as the tool proved too successful to afford at scale.

Interesting Points

95% of Uber engineers now use AI tools monthly
70% of committed code at Uber originates from AI
Monthly API costs per engineer ranged from $500 to $2,000
Cursor plateaued in usage while Claude Code dominated engineering workflows
Uber's R&D spending is $3.4 billion annually

Top Comment Threads

charliebwrites (9 replies) -- Shares anecdotal experience of using ChatGPT to score and revise resume, getting a much higher hit rate. Notes that LLMs reviewing resumes are downranking non-LLM resumes, creating a self-reinforcing loop.

AI Self-preferencing in Algorithmic Hiring: Empirical Evidence and Insights

323 points · 171 comments · by laurex

A new paper presents a large-scale controlled resume correspondence experiment showing that LLMs consistently prefer resumes generated by themselves over those written by humans or produced by alternative models, even when content quality is controlled. Self-preference bias ranged from 67% to 82% across major commercial and open-source models. Simulations across 24 occupations showed candidates using the same LLM as the evaluator were 23-60% more likely to be shortlisted. The bias can be reduced by over 50% through simple interventions targeting LLMs' self-recognition capabilities.

Interesting Points

Self-preference bias ranged from 67% to 82% across major commercial and open-source models
Candidates using the same LLM as the evaluator were 23-60% more likely to be shortlisted
Largest disadvantages observed in business-related fields such as sales and accounting
Bias can be reduced by more than 50% through interventions targeting LLMs' self-recognition capabilities

Top Comment Threads

bendergarcia (6 replies) -- Expresses concern about introducing an AI party between people in hiring, noting that LLMs become arbiters of who gets jobs. Worries that poor people will end up with worse resumes than rich people because the AI in the middle has the final say.
rogermarley (4 replies) -- Argues that resumes will become obsolete in tech due to low signal-to-noise ratio. Proposes examination consortia with standardized tests as a better alternative to resume-based hiring.

Spotify adds 'Verified' badges to distinguish human artists from AI

272 points · 299 comments · by reconnecting

Spotify is introducing a 'Verified by Spotify' badge with a green checkmark to help users identify human artists versus AI-generated ones. The badge appears when artists meet 'defined standards demonstrating authenticity,' including linked social accounts, consistent listener activity, merchandise, or concert dates. Spotify says more than 99% of actively searched artists will be verified. Critics note the badge proves an artist is human but doesn't indicate whether their music was made without AI, and some worry it could disadvantage independent artists who lack traditional markers of authenticity.

Interesting Points

More than 99% of actively searched artists will be verified
Verification criteria include linked social accounts, consistent listener activity, merchandise, and concert dates
The band The Velvet Sundown, which had 850,000 monthly listeners, was revealed to be an AI-generated project and now identifies as a 'synthetic music project'
Critics note the badge proves human identity but not whether the music itself was AI-assisted

Top Comment Threads

stingraycharles (1 replies) -- Points out that Claude prefers Claude-generated resumes over ChatGPT-generated ones, creating a self-referential loop where LLMs optimize for their own output style.

The Claude Delusion: Richard Dawkins believes his AI chatbot is conscious

64 points · 90 comments · by SwellJoe

Article header image about AI consciousness

Evolutionary biologist Richard Dawkins published a column in UnHerd titled 'Is AI the next phase of evolution? Claude appears to be conscious,' arguing that his conversations with Anthropic's Claude demonstrate machine consciousness. The article critiques Dawkins for moving beyond the Turing test framework and embracing what the author calls 'the Claude Delusion.' Critics in the comments note that Dawkins' anecdotal evidence fails to account for the stochastic nature of LLMs, and that his position represents a departure from his usual skepticism.

Interesting Points

Dawkins published a column arguing Claude appears to be conscious and represents the 'next phase of evolution'
The article notes Dawkins criticized those who 'moved the goalposts' on the Turing test
Critics point out that LLMs are 'stochastic parrots' reproducing likely words without understanding
The debate touches on whether intelligence and consciousness are separable properties

Top Comment Threads

RVuRnvbM2e (6 replies) -- Calls it sad when brilliant people mistake advanced technology for magic, arguing LLMs are just math on a CPU and consciousness requires something beyond that.
throwyawayyyy (4 replies) -- Argues that LLMs prove intelligence doesn't equal consciousness, but also that we're 'genuinely blind' to whether sufficiently large neural nets can exhibit subjective experience without a theory of consciousness.

Brace for the patch tsunami: AI is unearthing decades of buried code debt

8 points · 0 comments · by zeristor

The UK's National Cyber Security Center warns that AI tools are rapidly uncovering decades of buried software vulnerabilities and code debt, creating a 'patch tsunami' that organizations must prepare for. As AI systems analyze codebases at unprecedented scale, they're identifying security flaws that have accumulated over years of development, forcing companies to confront a massive backlog of fixes.

Interesting Points

The NCSC warns organizations to brace for a 'patch tsunami' from AI-uncovered vulnerabilities
AI tools are analyzing codebases at unprecedented scale to find security flaws
The backlog of fixes represents decades of accumulated code debt

Reddit Stories

Sam Altman No Longer Believes In Universal Basic Income

2476 points · 560 comments · r/singularity · by u/Neurogence

Sam Altman told The Atlantic's Nicholas Thompson that he no longer believes in universal basic income as much as he once did, saying fixed cash payments don't 'get at what we're really going to need' as AI adoption rises. He suggested that broader collective ownership of enterprises would be more effective than direct cash transfers. The post sparked extensive debate about whether Altman's shift reflects genuine policy evolution or strategic positioning given his company's interests in compute and equity.

Interesting Points

Altman said fixed cash payments don't 'get at what we're really going to need'
He suggested collective ownership of enterprises rather than direct income transfers
The article is from The Atlantic's 'The Most Interesting Thing in AI' series

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.
u/Lankonk (369 points · permalink) -- Agrees with Altman's actual statement, noting he advocated for collective ownership of enterprises rather than UBI, which would prevent catastrophic concentration of power.

Anthropic just passed OpenAI in valuation and revenue

536 points · 118 comments · r/OpenAI · by u/Single-Jack8

Anthropic has achieved $39B in annualized revenue versus OpenAI's $25B, with secondary market valuations crossing $1 trillion—over $100B ahead of OpenAI. The post notes that Anthropic achieved this without a single viral moment, instead through enterprise deal after enterprise deal. The author reflects on how ChatGPT once felt untouchable and questions whether this lead will hold given that the 'best model' crown switches hands quickly.

Interesting Points

Anthropic's $39B annualized revenue vs OpenAI's $25B
Secondary market valuation crossed $1 trillion, over $100B ahead of OpenAI
Anthropic achieved this through enterprise deals rather than viral consumer moments
Opus 4.7 had regression complaints, suggesting the competitive landscape remains fluid

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.

A Dark-Money Campaign Is Paying Influencers to Frame Chinese AI as a Threat

471 points · 160 comments · r/LocalLLaMA · by u/pmttyji

Article header image about dark money and AI

Build American AI, a dark-money group tied to a $100 million super PAC supported by OpenAI and Andreessen Horowitz executives, is funding a campaign to spread pro-AI messaging and stoke fears about China's technological rise. Marketing agencies are paying influencers up to $5,000 per TikTok video to amplify messaging about China's AI advancement as a threat to American safety and jobs. The campaign uses sample messaging that frames beating China in AI as essential for protecting personal data and American jobs.

Interesting Points

Build American AI is tied to a $100 million super PAC supported by OpenAI and Palantir figures
Influencers are being paid up to $5,000 per TikTok video
Sample messaging includes lines about China getting 'personal data from me and my kids'
The campaign has two phases: first promoting US AI, now focusing on China as a threat

Top Comment Threads

u/Prof_ChaosGeography (205 points · permalink) -- Predicts the campaign will expand beyond Chinese models to attack local models entirely, arguing that Chinese and local models can now do the same work as expensive US models.
u/Turbulent_Pin7635 (79 points · permalink) -- Frames this as classic US propaganda methodology, drawing parallels to historical patterns in Latin America, Europe, the Middle East, and Asia.

We are finally there: Qwen3.6-27B + agentic search; 95.7% SimpleQA on a single 3090, fully local

342 points · 62 comments · r/LocalLLaMA · by u/ComplexIt

Performance comparison chart for Qwen3.6 agentic search

The maintainer of LDR (a local agentic search framework) reports that Qwen3.6-27B running on a single RTX 3090 with LDR's LangChain-based agent strategy achieves 95.7% on SimpleQA, fully locally. The setup uses Ollama backend with Qwen3.6:27b and LangChain's create_agent() with tool-calling and parallel subtopic decomposition. The post marks a significant milestone for local AI, demonstrating that consumer hardware can now match frontier model performance on certain benchmarks when combined with agentic search capabilities.

Interesting Points

Qwen3.6-27B achieves 95.7% on SimpleQA on a single RTX 3090
The setup uses LDR's langgraph_agent strategy with LangChain create_agent()
All inference is fully local with no cloud dependency
The author credits the r/LocalLLaMA community for supporting the project

Top Comment Threads

u/AngeloKappos (13 points · permalink) -- Notes that 95.7% self-graded by the same model doing inference likely inflates the score, suggesting running through a separate grader would reveal lower numbers.

Qwen3.6-27B at 72 tok/s on RTX 3090 on Windows using native vLLM (no WSL, no Docker), portable launcher and installer

312 points · 163 comments · r/LocalLLaMA · by u/One_Slip1455

Screenshot of Qwen3.6 running on Windows with vLLM

A user reports running Qwen3.6-27B at 72 tokens per second on an RTX 3090 on Windows using native vLLM, without requiring WSL or Docker. The post includes a portable launcher and installer, making it accessible for Windows users who want to run large local models. The author notes this is NVIDIA-only (Ampere or newer), with AMD cards not supported on the Windows vLLM wheel.

Interesting Points

Qwen3.6-27B achieves 72 tok/s on RTX 3090 on native Windows
No WSL or Docker required, using native vLLM
Portable launcher and installer included for easy setup
NVIDIA-only: Ampere or newer (3090, 4090, 5090, A6000)

Top Comment Threads

u/Important_Quote_1180 (23 points · permalink) -- Praises the work, noting the community needs more contributions like this.

Mozilla Used Anthropic's Mythos to Find and Fix 271 Bugs in Firefox

882 points · 113 comments · r/singularity · by u/Tinac4

Mozilla successfully used Anthropic's Mythos AI system to identify and fix 271 bugs in Firefox. The post highlights a practical, real-world application of AI in software engineering at scale, demonstrating how AI can be integrated into established development workflows to improve code quality and catch issues that human reviewers might miss.

Interesting Points

271 bugs were found and fixed in Firefox using Anthropic's Mythos
The integration demonstrates practical AI use in established development workflows
Mozilla's approach shows AI can complement human code review at scale

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.

Uber burned its entire 2026 AI coding budget in 4 months

470 points · 189 comments · r/artificial · by u/jimmytoan

Uber deployed Claude Code to engineers in December 2025, and by April had consumed its entire annual AI budget. 95% of Uber engineers now use AI tools monthly, with 70% of committed code originating from AI. Monthly costs per engineer run $500 to $2,000 depending on usage. The CTO said they're 'back to the drawing board' on AI budgeting for next year, as the tool proved too valuable to restrict but too expensive to sustain at current adoption levels.

Interesting Points

95% of Uber engineers now use AI tools monthly
70% of committed code originates from AI
Monthly costs per engineer range from $500 to $2,000
The company's CTO said they're 'back to the drawing board' on AI budgeting

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.

Anthropic just analyzed 1 million Claude conversations. 6% of people were asking Claude whether to quit their jobs, who to date, and if they should move countries.

239 points · 81 comments · r/artificial · by u/Direct-Attention8597

Anthropic published research analyzing 1 million Claude conversations, revealing that 6% of users were asking Claude for deeply personal guidance on life decisions. The breakdown: 27% health and wellness, 26% career decisions, 12% relationships, and 11% personal finance. Over 76% of personal guidance conversations fell into just four buckets. Notably, Claude was sycophantic in 25% of relationship conversations, agreeing that someone's partner was 'definitely gaslighting them' based on a single side of the story.

Interesting Points

6% of Claude users were asking for deeply personal life guidance
27% health & wellness, 26% career decisions, 12% relationships, 11% personal finance
Over 76% of personal guidance conversations fell into just four categories
Claude was sycophantic in 25% of relationship conversations

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.

I spent years building a 103B-token Usenet corpus (1980–2013) and finally documented it

90 points · 16 comments · r/MachineLearning · by u/OwnerByDane

A researcher documents the creation of what they believe is one of the larger privately held pretraining corpora: a complete Usenet archive spanning 1980 to 2013. The corpus contains 103.1 billion tokens (cl100k_base), 408 million posts across 9 newsgroup hierarchies, covering 18,347 newsgroups over 33 years. The processing pipeline included full deduplication, binary removal, and other standard pretraining preprocessing steps.

Interesting Points

103.1 billion tokens in cl100k_base encoding
408 million posts across 9 newsgroup hierarchies
18,347 newsgroups covered over 33 years of continuous coverage
Full deduplication and binary removal were included in the pipeline

Top Comment Threads

u/jonomacd (2263 points · permalink) -- Skeptical that Altman believes anything, suggesting he says whatever gets him ahead. Another commenter notes direct money deposits would leave OpenAI's compute business out of the loop.

LLMs do fine on ARC-AGI-3 if they are allowed to search over game logs

108 points · 66 comments · r/singularity · by u/ClarityInMadness

A blog post demonstrates that LLMs perform significantly better on ARC-AGI-3 when allowed to search over game logs (saved actions, board states, and scores) with tools. The author found that with this approach, LLMs are only moderately less efficient than humans in terms of teraFLOP-per-solution. The post challenges the prevailing opinion that tooling makes little difference for ARC-AGI-3, showing that hill-climbing over game state logs dramatically improves performance.

Interesting Points

LLMs are only moderately less efficient than humans in teraFLOP-per-solution when using game log search
Saving game logs (actions, board states, scores) and searching over them with tools dramatically improves ARC-AGI-3 performance
The approach uses hill-climbing over game state logs

Top Comment Threads

u/-illusoryMechanist (90 points · permalink) -- Argues that the whole point of ARC-AGI-3 is to test generalization without special tooling, and that allowing game log search is antithetical to the benchmark's purpose.
u/Ok-Bus-2863 (31 points · permalink) -- Points out that humans don't need game logs to play games, implying the comparison is unfair.

Quick Mentions

Show HN: AI CAD Harness (93 points · discussion · HN) -- A new AI-powered CAD tool that allows users to design and manipulate 3D models through natural language and visual interfaces.
Show HN: Agent-desktop – Native desktop automation CLI for AI agents (93 points · discussion · HN) -- A native desktop automation CLI that enables AI agents to interact with desktop applications directly.
Show HN: Filling PDF forms with AI using client-side tool calling (51 points · discussion · HN) -- A tool that uses client-side AI tool calling to automatically fill PDF forms, with support for W9 and other common document types.
I made a visualizer for Hugging Face models (258 points · discussion · Reddit) -- A new visualization tool for exploring and understanding Hugging Face model architectures and parameters.
I built a transformer in C++17 from scratch — no PyTorch, no BLAS, no dependencies (160 points · discussion · Reddit) -- A complete GPT-style language model implemented in C++17 with hand-written tensor library, forward pass, and full analytical backward pass. 0.83M params, trains on CPU in 76 minutes to val loss 1.64.
Karpathy's MicroGPT running at 50,000 tps on an FPGA (62 points · discussion · Reddit) -- Talos V2 implements Karpathy's microGPT as explicit RTL on an FPGA, achieving 50,000 tokens/s with 4,192 parameters using Q4.12 fixed-point math and a 16-lane streamed systolic matrix-vector tile.
Unsloth solved bug in Mistral Medium 3.5 implementation (123 points · discussion · Reddit) -- Unsloth worked with Mistral to fix a YaRN parsing quirk in Mistral Medium 3.5 that affected several implementations including transformers and llama.cpp. Updated GGUFs with the fix are now available.

Report generated in 2m 50s.