Daily Digest — 2026-02-20
6 items | Business, AI, DevTools
Quick Summary
Jack Altman's 5 Lessons on Finding Product-Market Fit
Source: a16z speedrun (Substack) · Category: Business · Link: Original
- Jack Altman shared five seed-stage lessons in an a16z fireside chat.
- If PMF exists, traction appears quickly; repeated “just one more feature” requests often signal weak PMF.
- Early hiring should focus on “diamonds in the rough” rather than obvious, fully priced talent.
Cursor's Self-Learning Coding Agents
Source: Cursor Blog · Category: AI · Link: Original
- Cursor introduced a “Dynamic Context Discovery” pattern to reduce repeated agent mistakes.
- Converting tool outputs into files and syncing MCP tools by folders reduced token use by 46.9%.
- Agent skills are defined as files and discovered dynamically with grep and semantic search.
Lessons from Building Claude Code: Prompt Caching Is Everything
Source: Anthropic Engineering · Category: AI · Link: Original
- Anthropic explains why prompt caching is core to production-grade agents.
- Reusing prior round-trip computation dramatically lowers latency and cost.
- Cache-friendly design requires ordering static before dynamic prompt parts and keeping model/tool sets stable.
Andrej Karpathy: The Future of Customized Software
Source: X (Twitter) · Category: AI · Link: Original
- Karpathy reverse-engineered a treadmill API and built a personalized dashboard with an LLM agent in about one hour.
- He argues that app-store style discrete app selection is becoming outdated.
- Progress is constrained because most products still do not expose AI-native CLI/API interfaces.
Prompt Auto-Caching with Claude
Source: LangChain Blog · Category: DevTools · Link: Original
- Claude API now supports automatic caching, pricing cached tokens at about 10% of uncached cost.
- A single `cache_control` parameter replaces complex manual breakpoint management.
- Combined with cache-friendly prompts, this can materially improve unit economics for agents.
OpenAI Prompt Caching 201
Source: OpenAI Cookbook · Category: DevTools · Link: Original
- OpenAI describes strategies that can reduce time to first token (TTFT) by up to 80% and input cost by up to 90%.
- Cache hits require exact prefix matching for at least 1,024 tokens, matched in 128-token blocks.
- Responses API can achieve 40-80% better cache utilization than Chat Completions.
Detailed Notes
1. Jack Altman's 5 Lessons on Finding Product-Market Fit
Jack Altman (Lattice co-founder, Alt Capital) shared practical PMF lessons for seed-stage teams.
Five lessons
- **Hire early “diamonds in the rough”**
  - Elite candidates often have many safer options and may not join very early startups.
  - Look for high-upside people the market has not fully priced in yet.
- **Real PMF shows up quickly**
  - Lattice's review product was their third pivot.
  - Paying customers appeared before the product was fully finished.
  - Endless “one more feature” requests usually indicate solution mismatch, not PMF.
- **Listen to customers without losing your north star**
  - Large near-term deals can pull the roadmap off strategy.
  - Out-of-scope asks often cascade into more misaligned asks.
- **Operate proactively, not reactively**
  - Many founders spend most of their time reacting to external requests.
  - Past ~20 employees, the CEO role shifts toward building the system that builds products.
- **No universal startup advice exists**
  - Advice optimized for the broadest audience often drops context.
  - In fundraising, follow a structured process, keep the timeline compressed, and resist outside pressure.
2. Cursor's Self-Learning Coding Agents
Cursor's Jediah Katz introduced “Dynamic Context Discovery” to make coding agents more efficient and less error-prone.
Five implementation techniques
- **Convert tool outputs into files**
  - Save long shell/MCP outputs as files instead of truncating them in chat.
  - Agents can retrieve only needed slices (`tail`, targeted reads) without data loss.
- **Use chat history as a retrievable reference**
  - When context windows require summarization, keep prior conversations as accessible files.
  - This helps recover details lost in compression.
- **File-defined agent skill system**
  - Skills are defined as files in an open format.
  - Agents dynamically discover relevant skills via grep and semantic search.
- **Optimize MCP loading**
  - Replace static inclusion of all tool descriptions with server-folder synchronization.
  - Reported token-usage drop on MCP calls: 46.9%.
- **Sync terminal output to files**
  - Mirror integrated terminal sessions to local files.
  - Enables queryable execution state without manual copy/paste.
Design principle
- Treat files as the simplest scalable primitive for LLM tooling.
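The files-as-context techniques above can be sketched in a few lines. This is an illustrative pattern only, not Cursor's actual implementation; the helper names (`save_tool_output`, `read_tail`) and the `agent_outputs` directory are assumptions.

```python
from pathlib import Path

OUTPUT_DIR = Path("agent_outputs")
OUTPUT_DIR.mkdir(exist_ok=True)

def save_tool_output(name: str, text: str) -> Path:
    """Persist a long shell/MCP output as a file instead of pasting it into chat."""
    path = OUTPUT_DIR / f"{name}.log"
    path.write_text(text)
    return path

def read_tail(path: Path, n_lines: int = 20) -> str:
    """Let the agent pull only the slice it needs (like `tail`), without data loss."""
    lines = path.read_text().splitlines()
    return "\n".join(lines[-n_lines:])

# Full output stays on disk; only the relevant slice enters the context window.
log = save_tool_output("build", "\n".join(f"step {i}: ok" for i in range(1000)))
print(read_tail(log, 3))
```

The same file can later be searched with grep or semantic tools, which is what makes plain files work as the retrieval primitive.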
3. Lessons from Building Claude Code: Prompt Caching Is Everything
Anthropic describes prompt caching as the core production primitive for long-running agents.
Core mechanics
- Reuse computation from prior round-trips to reduce both latency and cost.
- Agent systems make many sequential calls, so stable system prompts/tools/history deliver large cache reuse.
Cache-friendly design rules
- Put static content first (system prompt, tool schema), dynamic user content later.
- Avoid model switching mid-session (switches invalidate cache).
- Keep tool sets stable in membership and order.
- Use cache-preserving forking/compaction patterns.
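A minimal sketch of these ordering rules, using the shape of the Anthropic Messages API's `cache_control` field: static system prompt and tool schema lead and stay byte-identical across calls, while only the trailing user turn varies. The model name, prompt text, and tool are placeholders.

```python
# Static parts first, marked as a cache breakpoint; dynamic content last.
STATIC_SYSTEM = [
    {
        "type": "text",
        "text": "You are a coding agent. Follow repository conventions.",
        "cache_control": {"type": "ephemeral"},  # cache everything up to here
    }
]

STATIC_TOOLS = [
    {
        "name": "read_file",  # hypothetical tool for illustration
        "description": "Read a file from the workspace.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }
]

def build_request(user_turn: str) -> dict:
    """Only the messages change between calls, so the shared prefix stays cacheable."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder; avoid switching models mid-session
        "max_tokens": 1024,
        "system": STATIC_SYSTEM,       # byte-identical across calls
        "tools": STATIC_TOOLS,         # stable in membership and order
        "messages": [{"role": "user", "content": user_turn}],
    }

# With the anthropic SDK installed and an API key set, the payload would be sent as:
# client = anthropic.Anthropic()
# client.messages.create(**build_request("Refactor utils.py"))
```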
Production implication
- This strategy is presented as one of Claude Code's key leverage points and reflected in the Claude Agent SDK.
4. Andrej Karpathy: The Future of Customized Software
Karpathy shares a concrete build log for highly customized AI-generated software.
Example build
- Goal: 8-week resting heart-rate experiment (50 -> 45 BPM).
- In about one hour, he used Claude to reverse-engineer Woodway treadmill APIs and build a dashboard.
- He still needed to catch and correct unit/date bugs manually.
- Total code size was around 300 lines; he estimates this used to take ~10 hours.
Two claims
- **The app-store model is aging out**
  - Finding and installing niche apps for every tiny need becomes less natural.
  - On-demand generation of custom apps by agents becomes the default path.
- **Need AI-native sensors/actuators**
  - Devices should expose machine-usable interfaces (API/CLI), not human-only frontends.
  - Today, most products still expose HTML/CSS docs rather than agent-native interfaces.
5. Prompt Auto-Caching with Claude
LangChain highlights Claude's new auto-caching behavior.
Key points
- Cached token pricing is approximately 10% of uncached tokens.
- Developers can rely on `cache_control` instead of manually managing breakpoints.
- When paired with stable prompt structure and tool consistency, agent cost efficiency improves meaningfully.
6. OpenAI Prompt Caching 201
OpenAI's cookbook covers advanced prompt caching tactics.
Cache hit requirements
- Exact prefix match across requests.
- At least 1,024 tokens before auto-caching can apply.
- Matching is block-based (128-token increments until first mismatch).
- Messages, images, audio, tools, and schemas all affect cache identity.
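As an illustrative model (not OpenAI's implementation), the hit rules above can be expressed as: count the exact-match prefix, round down to a 128-token block, and serve nothing from cache below the 1,024-token floor.

```python
MIN_CACHED_TOKENS = 1024
BLOCK = 128

def cached_prefix_tokens(prev: list[str], new: list[str]) -> int:
    """Return how many tokens of `new` would be served from cache under these rules."""
    common = 0
    for a, b in zip(prev, new):
        if a != b:  # stop at the first mismatch; only the exact prefix counts
            break
        common += 1
    matched = (common // BLOCK) * BLOCK  # matched in 128-token increments
    return matched if matched >= MIN_CACHED_TOKENS else 0

prev = ["tok"] * 2000
print(cached_prefix_tokens(prev, prev[:1500] + ["changed"] * 10))  # 1408 (11 blocks)
```

Note how a 1,000-token match yields zero cached tokens: 1,000 rounds down to 896, which is under the floor, so any change near the top of the prompt forfeits the entire cache.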
Discount examples (as cited)
- GPT-4o: 50% cached-token discount.
- gpt-4.1: 75%.
- gpt-5-nano: 90%.
- gpt-realtime (audio): 98.75%.
Operational guidance
- Intentionally exceed 1,024 tokens when stable prefix reuse is expected.
- Place stable instructions/examples/tools first, variable user input last.
- Prefer `allowed_tools` over reshuffling tool arrays.
- Use `prompt_cache_key` for request routing by shared prefixes.
- Consider Responses API for higher cache utilization.
- Treat summarization/compaction as cache-breaking events and plan tradeoffs.
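The operational guidance above can be sketched against the Responses API request shape: stable instructions first and padded past the 1,024-token minimum, variable input last, and `prompt_cache_key` grouping requests that share a prefix. The model name, instructions text, and key scheme are placeholders.

```python
# Stable prefix: identical bytes on every call, intentionally over 1,024 tokens.
STABLE_INSTRUCTIONS = (
    "You are a support agent. Follow the policy manual below.\n"
    + "policy line\n" * 300  # padding stands in for real static content
)

def build_request(user_input: str, tenant_id: str) -> dict:
    """Static instructions lead; only `input` varies; the cache key routes
    requests sharing this prefix to the same cache."""
    return {
        "model": "gpt-4.1",                          # placeholder model name
        "instructions": STABLE_INSTRUCTIONS,          # byte-identical across calls
        "input": user_input,                          # variable content last
        "prompt_cache_key": f"support-{tenant_id}",   # hypothetical key scheme
    }

# With the openai SDK installed and an API key set:
# client = OpenAI()
# client.responses.create(**build_request("Where is my order?", "acme"))
```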