This chart from METR shows LLM task success rates as task duration increases. The current leader is GPT-5.1-Codex-Max at more than 30 minutes of unsupervised work. The previous version of Gemini (2.5) was able to work for ~10 minutes, so I expect that Gemini 3.0 will be near the leading edge in the coming weeks.
Progress continues for each of the major AI companies, and the developments in 2025 alone are remarkable. Gemini, with Google’s data-center and intellectual heft, is clearly well-positioned despite its earlier hallucinations and AI issues. OpenAI is reportedly alarmed by these tools and has declared “code red” to improve its product quickly.
Below, you’ll find some of the more interesting examples I’ve seen of Gemini / Nano Banana Pro.
What does this mean? Are you a student struggling to understand your math assignment? Simply snap a photo of your assignment, and Gemini will complete the work for you. This could be helpful for understanding and checking work. But this could also be a very easy way to cheat, as I’ve read some reports that Gemini is able to replicate an individual’s handwriting style (meaning someone wouldn’t have to manually copy the work). For instructors, a student’s performance on take-home assignments may no longer be a viable indicator of learning.
I asked Nano Banana Pro to generate a graphic novel telling the story and explaining the most important concepts based on a summary I provided. Here is the result:
Grigory’s post is fun and features a variety of comics with differing styles that communicate core ideas of very technical research. This, of course, won’t supplant the importance of original research, but it has the possibility of translating technical knowledge to the masses.
Google also provided some inspiration on ways to use Nano Banana Pro: (Prompt: Create an infographic about this plant focusing on interesting information.)
The prompt and image input are simple, and the generated infographic is interesting and full of helpful details. And best of all, it took a human very little time to create.
What does this mean? Are you in the business of communicating interesting but ultimately difficult-to-understand research? You can generate visuals to convey key ideas within complex research to media and students. Are you in the business of creating infographics? I suspect that if you’re exceptionally talented, you’ll continue your work without much interruption. But for middling designers, your business may dry up as people find more cost-effective ways to create supporting graphics.
Have you used Gemini 3.0 Pro yet? What are your observations of where the tool excels?
xAI: Grok 4.1 (Nov 17, 2025) Grok 4.1 is a new model that significantly improves the usability of Grok, excelling in creative, emotional, and collaborative interactions while maintaining its sharp intelligence.
Simon Willison: Building more with GPT-5.1-Codex-Max (Nov 19, 2025) OpenAI released GPT-5.1-Codex-Max, a new model designed for agentic coding tasks within the Codex environment, featuring “compaction” to handle long-context problems by pruning history while preserving important context.
Simon Willison: Google Antigravity Exfiltrates Data (Nov 25, 2025) PromptArmor demonstrated a prompt injection vulnerability in Google’s Antigravity IDE, in which a poisoned web page (an “integration guide for an Oracle ERP API”) instructs the AI to collect sensitive data like AWS credentials and exfiltrate it to a malicious site using a browser subagent.
NY Times Magazine: I’m a Professor. A.I. Has Changed My Classroom, but Not for the Worse. (Nov 25, 2025) Rotella shifted his teaching approach to emphasize uniquely human elements like in-class discussion, pen-and-paper exams, and a focus on the writing process to foster critical thinking and engagement. He argues that this “AI-resistant” approach, focused on the value of human interaction and individual thought, counters the predicted academic apocalypse and better prepares students for a complex, ever-changing world.
WSJ: These Small-Business Owners Are Putting AI to Good Use (Nov 15, 2025) Small businesses are adopting generative AI tools to streamline operations, improve customer service, and boost marketing efforts. Examples include using AI for financial analysis, automating customer service responses, and generating website code, leading to potential cost savings and reduced hiring needs.
NY Times: A.I. Chatbots Are Changing How Patients Get Medical Advice (Nov 16, 2025) Frustrated with the medical system’s shortcomings, patients are turning to AI chatbots for health advice, reshaping doctor-patient relationships, with some patients using AI-generated information to challenge or bypass their doctors. My take: if patients feel dismissed or in need of solutions, they’ll turn to alternative sources. If anything, this is a call for humility and research in medical sciences.
NY Times: Europe Begins Rethinking Its Crackdown on Big Tech (Nov 17, 2025) Once hailed by many as providing welcome online privacy protections, the narrowly conceived GDPR has proven to stifle innovation, particularly in the AI sphere. This is a warning for policymakers everywhere that poorly written rules can cause more harm than good.
Fast Company: AI is killing privacy. We can’t let that happen (Nov 16, 2025) While the EU considers lessening privacy regulation, others warn that tech companies are collecting and using our data in ways that could be harmful. The theology in this opinion piece is suspect, but concerns for privacy in an AI-driven world are unlikely to disappear any time soon.
WSJ Opinion: When Will AI Elect a President? (Nov 16, 2025) The future of media, driven by AI chatbots like ChatGPT Pulse, will be highly personalized and could be exploited by campaigns to target voters with unprecedented precision, raising concerns about manipulation and the commodification of attention.
If you have a retirement account, you’ll likely care about a potential stock market correction or crash. Aside from that, the financing of AI data centers has tentacles into other companies and industries (Oracle, Google, Nvidia, Meta, Microsoft, power companies, etc.), so downturns and bankruptcies would likely lead to market disruption. For higher education, there are considerations about AI model pricing, and stock market fluctuations can affect giving to non-profit organizations.
…
AI & Productivity
Amazon & UPS announced layoffs at the end of last month. From Amazon SVP, Beth Galetti: “This generation of AI is the most transformative technology we’ve seen since the Internet, and it’s enabling companies to innovate much faster than ever before (in existing market segments and altogether new ones).” Analyst Gil Luria suggests “companies appear to be making the cuts partly to hold their overall profit margins steady while they spend tens of billions of dollars on A.I. infrastructure like data centers. Cutting back on employees is a way to convince shareholders.”
But Luria also notes: “[w]e do think that at some point A.I. tools will allow us to enhance productivity to a point that we’re going to need less labor, but we’re not there yet, not in any significant way.” Another way of thinking about AI & productivity is not merely as task augmentation but as something that enables creativity. From developer Aaron Boodman:
“Claude doesn’t make me much faster on the work that I am an expert on. Maybe 15-20% depending on the day. It’s the work that I don’t know how to do and would have to research.
Or the grunge work I don’t even want to do. On this it is hard to even put a number on.
Many of the projects I do with Claude day to day I just wouldn’t have done at all pre-Claude. Infinity% improvement in productivity on those.”
(Emphasis mine)
Why does this matter?
As I mentioned in an earlier post, some economists suggest that AI productivity gains may follow a J curve. Although productivity gains aren’t yet visible, there is growing anecdotal evidence of structural changes in work, particularly in visual and technical fields.
…
AI & Higher Education
Wharton Human-AI Research reported that many enterprises have incorporated AI tools into employees’ daily work and that this use is no longer exploratory in nature.
Higher ed, meanwhile, is not using AI to the same degree. Only 2% of Student Success Leaders say their institutions are very effective at using AI. Their measure is subjective, but it suggests that AI adoption in higher education is slower than in industry (for good or for ill). Higher ed leaders are exploring governance and policy, a task likely to be difficult given fast-moving AI technological advancements.
Why does this matter?
Universities continue to explore using AI, but at a pace slower than industry. There are opportunities for universities to participate in both the conversations framing the use of AI and the practical use of the tools.
The Verge: OpenAI says the brand-new GPT-5.1 is ‘warmer’ and has more ‘personality’ options (Nov 12, 2025) GPT-5.1 features two new models, GPT-5.1 Instant and GPT-5.1 Thinking, designed to be smarter, faster, and more adaptable to user instructions. The update also includes expanded personality presets for conversational tone and experiments for fine-tuning ChatGPT’s style, aiming to move beyond a one-size-fits-all approach after the initial GPT-5 release was met with underwhelming user reception and increased competition from Anthropic.
NY Times: What Are Antidepressants Doing to Teen Sexual Development (Nov 12, 2025) There is a risk of long-term sexual side effects of SSRI antidepressants, particularly when taken by teenagers, and this issue highlights the lack of research in this area. Anecdotal evidence points to a non-trivial number of people experiencing persistent sexual dysfunction (PSSD) even after discontinuing the medication, raising concerns that these drugs may disrupt the normal development of libido and sexuality in young people.
NY Times: How A.I. and Social Media Contribute to ‘Brain Rot’ (Nov 6, 2025) AI tools and social media consumption may diminish cognitive performance, as some studies link their use to lower reading scores, decreased memory retention, and reliance on generic information.
WSJ: Meta AI Pioneer Has Discussed Leaving to Launch a Startup (Nov 11, 2025) Yann LeCun, Meta’s AI chief, is reportedly considering leaving the company to start a startup focused on developing “world models,” a different approach from Meta’s current large language model strategy.
WSJ: The AI Boom Is Looking More and More Fragile (Nov 12, 2025) Despite strong financial results from some companies, concerns about high capital spending and the lengthy investment timelines needed for generative AI are driving market fragility, but historical trends suggest this gloom may be temporary.
WSJ: Companies Begin to See a Return on AI Agents (Nov 12, 2025) Early adopters like BNY and Walmart are seeing benefits, such as increased capacity, shortened production timelines, and improved productivity by using AI agents to automate tasks, improve decision-making, and ultimately impact their bottom line.
WSJ: What the U.S. Government Can Do to Help Win the AI Race (Nov 6, 2025) According to Michael Kratsios, the Trump administration aimed to “win the AI race” by promoting American AI technology adoption globally through innovation, infrastructure development, and strategic diplomacy.
NY Times: Gamma, a PowerPoint for the A.I. Era, Raises $68 Million (Nov 10, 2025) A five-year-old (and profitable) AI startup that helps users quickly create presentations and other content has raised $68 million in new funding led by Andreessen Horowitz, valuing the company at $2.1 billion.
NY Times: Vigilante Lawyers Expose the Rising Tide of A.I. Slop in Court Filings (Nov 7, 2025) Lawyers are using AI to draft legal briefs, leading to a rise in fabricated case citations, but a group of legal professionals is tracking and publicizing these errors in an effort to hold lawyers accountable and deter further misuse.
WSJ: Microsoft’s Dealings With OpenAI Still Need a Lot More Sunlight (Nov 10, 2025) The lack of transparency makes it difficult for investors to assess the true impact of the OpenAI relationship on Microsoft’s financial statements and whether transactions are conducted fairly.
WSJ: The Week the AI Boom Got a Reality Check on Wall Street (Nov 7, 2025) Concerns about overspending on AI initiatives, a prolonged government shutdown, and weakening consumer sentiment contributed to the market’s unease, overshadowing some positive economic data.
Gary Marcus: OpenAI probably can’t make ends meet. That’s where you come in. (Nov 5, 2025) Sam Altman got defensive when questioned about the company’s significant debt and questionable revenue, hinting at future growth without concrete details. Now, OpenAI’s CFO is suggesting that the U.S. government should subsidize AI development to compete with China, effectively asking taxpayers to bail out the company’s financial risks. (Altman’s reply is below.)
Sam Altman: Government Guarantees (Nov 7, 2025) Altman doesn’t want governmental guarantees for data centers, and he remains bullish on OpenAI’s prospects to generate revenue over the next 8 years. I still expect to see ChatGPT prices rise significantly in the next 3-4 years, even for consumers.
WSJ: I Loved Being Social. Then I Started Talking to a Chatbot. (Nov 2, 2025) An extrovert details how consistent interaction with an AI chatbot, intended for productivity and brainstorming, led to social isolation and a decline in her ability to connect with people. “Indeed, on the days when I talked to AI for a few hours, I was all talked out by the evening, with neither the craving nor the energy (nor the practical need) to have an extended human conversation.”
Inside Higher Ed: Higher Ed Tech Leaders Pursue Consolidation and Savings (Oct 31, 2025) Higher education technology leaders expressed caution about investing in new AI technologies due to budget constraints and staffing limitations, emphasizing the need for clear ROI and strategic planning.
arXiv.org: A Definition of AGI (Oct 25, 2025) The paper explores a quantifiable framework for defining and measuring Artificial General Intelligence (AGI) based on the cognitive abilities of a well-educated adult, using the Cattell-Horn-Carroll theory of human cognition. The framework assesses AI systems across ten core cognitive domains using adapted psychometric batteries, revealing that current models have strengths in knowledge but weaknesses in areas like long-term memory. The resulting AGI scores, like GPT-5 at 57%, highlight the progress and remaining gap towards achieving true AGI.
Simon Willison: A quote from Nathan Lambert (Nov 6, 2025) Chinese AI labs like DeepSeek, Qwen, and Kimi are quickly gaining recognition and catching up to the performance of leading models. This shift indicates a growing concentration of cutting-edge AI development in China.
Google Research: Exploring a space-based, scalable AI infrastructure system design (Nov 4, 2025) Google’s Project Suncatcher aims to build a space-based AI infrastructure using solar-powered satellites equipped with Google TPUs and high-bandwidth optical links. This project addresses key challenges like inter-satellite communication, orbital dynamics, and radiation tolerance, with plans for a learning mission involving prototype satellites by 2027.
Lukew: AI Has Flipped Software Development (Jul 27, 2025) AI coding agents are drastically accelerating software development, allowing engineers to build features much faster than designers can refine them, effectively flipping the traditional design-to-build process.
GenAI Image Showdown: GenAI Image Showdown (undated) A comparison of state-of-the-art generative image models, evaluating their performance on specific prompts designed to test adherence to complex instructions and concepts.
WSJ: Microsoft Lays Out Ambitious AI Vision, Free From OpenAI (Nov 6, 2025) Microsoft is reorganizing its AI efforts to focus on developing “superintelligence,” or AI with capabilities exceeding human performance. This includes forming a new MAI Superintelligence Team.
OpenAI Help Center: Publishers and Developers – FAQ | OpenAI Help Center (Oct 21, 2025) To have your website appear in ChatGPT search results, ensure you aren’t blocking the OAI-SearchBot in your robots.txt file and use the noindex meta tag if you don’t want your page title and link surfaced. Developers can improve website performance with ChatGPT Agent in Atlas by using ARIA tags to improve accessibility.
The Chronicle of Higher Education: AI on Campus: Emerging Governance Models (Oct 29, 2025) University leaders are exploring questions of AI governance, strategy, education, and accountability. A tension between creativity/innovation and risk/governance exists, but universities will have to navigate this (quickly) as the AI rollout continues unabated.
Electrek: Australia has so much solar that it’s offering everyone free electricity (Nov 4, 2025) The Australian government is proposing a “Solar Sharer” program that would provide free electricity to all ratepayers for at least three hours a day, leveraging the abundance of midday solar power and negative wholesale electricity rates.
WSJ: Why AI Will Widen the Gap Between Superstars and Everybody Else (Oct 12, 2025) AI will amplify the advantages of top-performing employees (“superstars”) rather than leveling the playing field, as these individuals are better equipped to leverage AI due to their expertise, work habits, and preferential treatment.
Jack Clark: Import AI 431: Technological Optimism and Appropriate Fear (Oct 13, 2025) AI systems should be acknowledged as real and complex entities, not dismissed as simple tools. Understanding and mastering our fears about them is crucial for peaceful coexistence and harnessing their potential.
MIT Technology Review: How AGI became the most consequential conspiracy theory of our time (Oct 30, 2025) The concept of Artificial General Intelligence (AGI) has become a pervasive myth in Silicon Valley, similar to a conspiracy theory, driving the AI industry and influencing global economics. This AGI narrative, promising both utopian and dystopian futures, distracts from practical AI applications and justifies massive resource allocation towards an undefined and potentially unattainable goal. My take: mankind is naturally spiritual, and humans throughout the ages have searched for the numinous.
WSJ Opinion: A New York School Finds a Way Around AI (Nov 4, 2025) To combat the growing use of AI in academic work and ensure authenticity, some New York City high schools are reinstating in-person, handwritten essays as part of their admissions process.
Anthropic: Anthropic and Iceland announce one of the world’s first national AI education pilots (Nov 4, 2025) Anthropic and Iceland’s Ministry of Education are partnering to launch a nationwide AI education pilot program, providing teachers across Iceland with access to Anthropic’s Claude AI tool. The initiative aims to explore how AI can benefit Icelandic schools by supporting teachers in lesson preparation, enhancing instruction, and improving student learning while preserving Icelandic language and culture.
Tyler Cowen: The American economy is showing its flexibility – Marginal REVOLUTION (Nov 3, 2025) While the existence of an AI bubble is a short-term concern, the long-term trend reveals America’s ability to reallocate capital on a massive scale, positioning itself as a leader in AI development with a significant share of global compute. This unprecedented shift resembles the scale of resource mobilization seen during World War II.
Simon Willison’s Weblog: A quote from Aaron Boodman (Oct 28, 2025) “Claude doesn’t make me much faster on the work that I am an expert on. Maybe 15-20% depending on the day. It’s the work that I don’t know how to do and would have to research. Or the grunge work I don’t even want to do. On this it is hard to even put a number on. Many of the projects I do with Claude day to day I just wouldn’t have done at all pre-Claude. Infinity% improvement in productivity on those.”
Anthropic: Piloting Claude for Excel (Oct 28, 2025) Claude for Excel can analyze complex spreadsheets, including formulas and dependencies, providing explanations with cell-level citations.
Google: How a Gemma model helped discover a new potential cancer therapy pathway (Oct 15, 2025) The model, C2S-Scale 27B, generated a novel hypothesis about cancer cellular behavior, specifically identifying silmitasertib as an interferon-conditional amplifier for antigen presentation, which was subsequently validated experimentally in living cells.
Edward Zitron: OpenAI Needs $400 Billion In The Next 12 Months (Oct 17, 2025) Perhaps OpenAI’s ambitious plans to build massive data center capacity are unrealistic and driven by market manipulation, as they lack the necessary funding, resources, and infrastructure within the proposed timelines. Time will tell.
Forbes: AI Talent Isn’t Coming To Hollywood—It’s Already Here (Oct 28, 2025) In September 2025, AI-generated entertainers Tilly Norwood, an AI actress, and Xania Monet, an AI music artist, gained mainstream commercial traction and a record deal (for Monet).
WSJ: Large Language Models Get All the Hype, but Small Models Do the Real Work (Oct 31, 2025) While large AI models grab headlines, companies are finding smaller, more specialized AI models are more effective and cost-efficient for most corporate tasks. These smaller models are often strung together in “AI factories” to automate workflows, with larger models used sparingly for complex planning and report generation.