AI isn’t just improving; it’s accelerating. A few key indicators:
- According to the Stanford Institute for Human-Centered Artificial Intelligence’s AI Index 2025, key benchmarks like MMLU (Massive Multitask Language Understanding) and GPQA (Graduate-Level Google-Proof Q&A) have improved dramatically in the past year.
- For example, inference cost at the level of GPT-3.5 dropped from around $20 per million tokens in late 2022 to about $0.07 per million tokens by late 2024 (a ~280× reduction), thanks to hardware and algorithmic advances.
- Multimodal AI is advancing: models increasingly integrate text, image, audio, and video in unified architectures (so-called “foundation models” spanning modalities).
- Edge and on-device AI are emerging: researchers note a growing push toward running AI inference on devices (IoT, mobile) for latency, privacy, and efficiency advantages.

Taken together, the AI systems you may be responsible for developing, managing, or integrating are becoming more capable, cheaper, and more ubiquitous. The question is no longer “let’s experiment with AI” but “how do we industrialize AI as part of the product or service stack?”
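The inference-cost figure cited above can be sanity-checked with simple arithmetic. A minimal sketch, using the per-million-token prices quoted in the text (both figures are illustrative, not exact vendor list prices):

```python
# Back-of-envelope check of the cited inference cost drop
# for GPT-3.5-level output.
cost_2022 = 20.00   # USD per million tokens, late 2022 (cited figure)
cost_2024 = 0.07    # USD per million tokens, late 2024 (cited figure)

reduction = cost_2022 / cost_2024
print(f"Cost reduction: ~{reduction:.0f}x")  # prints "Cost reduction: ~286x"
```

The exact ratio is ~286×, consistent with the ~280× reduction cited.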
Investment in AI remains huge. The U.S. alone saw ~US$109 billion in private AI investment in 2024. Adoption is accelerating too: in one study, ~78% of organizations reported using some form of AI in 2024, up from ~55% the year before. Real-world impact spans many domains, from healthcare (AI-enabled medical devices, diagnostics) to autonomous vehicles and robotaxis.
The U.S. remains ahead in the number of large model releases (40 in 2024, versus 15 from China and 3 from Europe, in one dataset), but China is closing the gap in model performance and open-model development. On the hardware side, specialized chips (TPUs, ASICs), neuromorphic approaches, and edge-optimized hardware are becoming strategic.
Why this matters: the combination of compute infrastructure, data, and algorithmic breakthroughs remains the bottleneck. Your strategic planning should factor in not just “use an AI API” but “what infrastructure, latency, cost, and data constraints apply?”
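To make the cost constraint concrete, a rough monthly-spend estimate for API-based inference can be computed from traffic volume, tokens per request, and the per-million-token price. A minimal sketch; the traffic profile and the $0.50/M-token rate below are hypothetical, not quotes from any vendor:

```python
def monthly_inference_cost(requests_per_day: int,
                           tokens_per_request: int,
                           usd_per_million_tokens: float,
                           days: int = 30) -> float:
    """Rough monthly API spend in USD for a given traffic profile."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical example: 50k requests/day, 1,500 tokens each, $0.50/M tokens
print(monthly_inference_cost(50_000, 1_500, 0.50))  # prints 1125.0
```

Running the same estimate against self-hosted or edge deployment costs is one way to ground the “cloud API vs. on-device” decision discussed above.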
Here are some developments to keep an eye on:
- Foundation models becoming ever more general-purpose: models that handle text, vision, audio, video, and reasoning in one unified architecture.
- Edge and embedded AI growth: running not just in the cloud but on devices, sensors, and vehicles, giving lower latency and better privacy.
- AI + automation in the enterprise: not just chatbots but full-stack automation: code generation, operational decision support, intelligent agents.
- Regulatory regimes and standardisation: compliance will become a differentiator. If your system is “AI-enabled”, regulators may treat it differently.
- Sustainability and ethics as differentiators: projects with transparent data usage, a lower carbon footprint, and explainability may become preferred by customers and partners.
- Geopolitics of AI: how global supply chains, compute infrastructure (chips, data centres), and regulatory alignment will shape opportunities and constraints.
References:
- “Artificial Intelligence Index Report 2025” (Stanford HAI): benchmark gains on MMLU and GPQA cited in the summary of highlights.
- AIWEDO.COM: inference cost drop from ~$20 to ~$0.07 per million tokens, a ~280-fold reduction.
- Techopedia: commentary on on-device/edge AI, multimodal integration, cost/efficiency, and inference context lengths.
- IT Brief Asia: U.S. private AI investment reached ~$109.1 billion in 2024; 78% of organisations using AI in 2024 (up from 55%).
- “Is AI Cheap Now? Inside the AI Inference Cost Crash.” Techopedia.
