Yoonchul Yi
โ† Back to daily insights

2026-02-20

/

📰 Daily Digest - 2026-02-20

6 items | Business, AI, DevTools


📋 Quick Summary

Jack Altman's 5 Lessons on Finding Product-Market Fit

Source: a16z speedrun (Substack) · Category: Business · Link: Original

  • Jack Altman shared five seed-stage lessons in an a16z fireside chat.
  • If PMF exists, traction appears quickly; repeated "just one more feature" requests often signal weak PMF.
  • Early hiring should focus on "diamonds in the rough" rather than obvious, fully priced talent.

Cursor's Self-Learning Coding Agents

Source: Cursor Blog · Category: AI · Link: Original

  • Cursor introduced a "Dynamic Context Discovery" pattern to reduce repeated agent mistakes.
  • Converting tool outputs into files and syncing MCP tools by folders reduced token use by 46.9%.
  • Agent skills are defined as files and discovered dynamically with grep and semantic search.

Lessons from Building Claude Code: Prompt Caching Is Everything

Source: Anthropic Engineering · Category: AI · Link: Original

  • Anthropic explains why prompt caching is core to production-grade agents.
  • Reusing prior round-trip computation dramatically lowers latency and cost.
  • Cache-friendly design requires ordering static before dynamic prompt parts and keeping model/tool sets stable.

Andrej Karpathy: The Future of Customized Software

Source: X (Twitter) · Category: AI · Link: Original

  • Karpathy reverse-engineered a treadmill API and built a personalized dashboard with an LLM agent in about one hour.
  • He argues that app-store style discrete app selection is becoming outdated.
  • Progress is constrained because most products still do not expose AI-native CLI/API interfaces.

Prompt Auto-Caching with Claude

Source: LangChain Blog · Category: DevTools · Link: Original

  • Claude API now supports automatic caching, pricing cached tokens at about 10% of uncached cost.
  • A single cache_control parameter replaces complex manual breakpoint management.
  • Combined with cache-friendly prompts, this can materially improve unit economics for agents.

OpenAI Prompt Caching 201

Source: OpenAI Cookbook · Category: DevTools · Link: Original

  • OpenAI describes strategies that can reduce time to first token (TTFT) by up to 80% and input cost by up to 90%.
  • Cache hits require exact prefix matching for at least 1,024 tokens, matched in 128-token blocks.
  • Responses API can achieve 40-80% better cache utilization than Chat Completions.

๐Ÿ“ Detailed Notes

1. Jack Altman's 5 Lessons on Finding Product-Market Fit

Jack Altman (Lattice co-founder, Alt Capital) shared practical PMF lessons for seed-stage teams.

Five lessons

  1. Hire early "diamonds in the rough"

    • Elite candidates often have many safer options and may not join very early startups.
    • Look for high-upside people the market has not fully priced in yet.
  2. Real PMF shows up quickly

    • Lattice's review product was their third pivot.
    • Paying customers appeared before the product was fully finished.
    • Endless "one more feature" requests usually indicate solution mismatch, not PMF.
  3. Listen to customers without losing your north star

    • Large near-term deals can pull the roadmap off strategy.
    • Out-of-scope asks often cascade into more misaligned asks.
  4. Operate proactively, not reactively

    • Many founders spend most of their time reacting to external requests.
    • Past ~20 employees, the CEO role shifts toward building the system that builds products.
  5. No universal startup advice exists

    • Advice optimized for the broadest audience often drops context.
    • Follow standard fundraising structure, but keep the timeline compressed and resist outside pressure.

2. Cursor's Self-Learning Coding Agents

Cursor's Jediah Katz introduced "Dynamic Context Discovery" to make coding agents more efficient and less error-prone.

Five implementation techniques

  1. Convert tool outputs into files

    • Save long shell/MCP outputs as files instead of truncating them in chat.
    • Agents can retrieve only needed slices (tail, targeted reads) without data loss.
  2. Use chat history as retrievable reference

    • When context windows require summarization, keep prior conversations as accessible files.
    • This helps recover details lost in compression.
  3. File-defined agent skill system

    • Skills are defined as files in an open format.
    • Agents dynamically discover relevant skills via grep and semantic search.
  4. Optimize MCP loading

    • Replace static inclusion of all tool descriptions with server-folder synchronization.
    • Reported total token usage drop: 46.9% on MCP calls.
  5. Sync terminal output to files

    • Mirror integrated terminal sessions to local files.
    • Enables queryable execution state without manual copy/paste.

Design principle

  • Treat files as the simplest scalable primitive for LLM tooling.
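Techniques 1 and 5 above reduce to a simple pattern: write the full output to disk, let the agent read back only the slice it needs. A minimal sketch, with hypothetical helper names (this is illustrative, not Cursor's implementation):

```python
import tempfile
from pathlib import Path

def save_tool_output(output: str, out_dir: Path, name: str) -> Path:
    """Persist a long tool/terminal output to a file instead of
    truncating it in the chat context (hypothetical helper)."""
    path = out_dir / f"{name}.log"
    path.write_text(output)
    return path

def read_tail(path: Path, n_lines: int = 20) -> str:
    """Retrieve only the slice the agent asked for, without data loss."""
    return "\n".join(path.read_text().splitlines()[-n_lines:])

# A 10,000-line build log is stored whole; only the requested
# slice ever enters the model context.
out_dir = Path(tempfile.mkdtemp())
log = "\n".join(f"line {i}" for i in range(10_000))
log_path = save_tool_output(log, out_dir, "build")
print(read_tail(log_path, 2))  # prints "line 9998\nline 9999"
```

The same file can later be grepped or re-read in full, which is what makes files a retrievable reference rather than a lossy summary.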

3. Lessons from Building Claude Code: Prompt Caching Is Everything

Anthropic describes prompt caching as the core production primitive for long-running agents.

Core mechanics

  • Reuse computation from prior round-trips to reduce both latency and cost.
  • Agent systems make many sequential calls, so stable system prompts/tools/history deliver large cache reuse.

Cache-friendly design rules

  1. Put static content first (system prompt, tool schema), dynamic user content later.
  2. Avoid model switching mid-session (switches invalidate cache).
  3. Keep tool sets stable in membership and order.
  4. Use cache-preserving forking/compaction patterns.
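A minimal sketch of rules 1-3, assuming a plain request-builder. The field names echo Anthropic's Messages API shape (`system`, `tools`, `cache_control`), but this only assembles a payload and is not Anthropic's code:

```python
SYSTEM_PROMPT = "You are a coding agent."  # stable across the session
TOOLS = [{"name": "read_file", "description": "Read a file from disk"}]

def build_request(history: list, user_msg: str) -> dict:
    """Assemble a cache-friendly request: static content first,
    dynamic user content last (illustrative sketch)."""
    return {
        "model": "claude-sonnet-4-5",  # rule 2: keep the model fixed mid-session
        # Rule 1: the stable prefix comes first and is marked cacheable.
        "system": [{"type": "text", "text": SYSTEM_PROMPT,
                    "cache_control": {"type": "ephemeral"}}],
        "tools": TOOLS,  # rule 3: same membership and order every turn
        "messages": history + [{"role": "user", "content": user_msg}],
    }

first = build_request([], "Fix the failing test")
second = build_request(first["messages"], "Now run it again")
# The static prefix is byte-identical across turns, so a prefix
# cache can serve it on every call after the first.
assert (first["system"], first["tools"]) == (second["system"], second["tools"])
```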

Production implication

  • This strategy is presented as one of Claude Code's key leverage points and reflected in the Claude Agent SDK.

4. Andrej Karpathy: The Future of Customized Software

Karpathy shares a concrete build log for highly customized AI-generated software.

Example build

  • Goal: 8-week resting heart-rate experiment (50 -> 45 BPM).
  • In about one hour, he used Claude to reverse-engineer Woodway treadmill APIs and build a dashboard.
  • He still needed to catch and correct unit/date bugs manually.
  • Total code size was around 300 lines; he estimates this used to take ~10 hours.

Two claims

  1. App-store model is aging out

    • Finding/installing niche apps for every tiny need becomes less natural.
    • Agents generating custom apps on demand becomes the default path.
  2. Need AI-native sensors/actuators

    • Devices should expose machine-usable interfaces (API/CLI), not human-only frontends.
    • Today, most products still expose HTML/CSS docs rather than agent-native interfaces.

5. Prompt Auto-Caching with Claude

LangChain highlights Claudeโ€™s new auto-caching behavior.

Key points

  • Cached token pricing is approximately 10% of uncached tokens.
  • Developers can rely on cache_control instead of manually managing breakpoints.
  • When paired with stable prompt structure and tool consistency, agent cost efficiency improves meaningfully.
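The cited ~10% figure implies roughly the following cost model. This is a back-of-envelope sketch; the $/MTok price and cache-hit fraction below are illustrative inputs, not quoted prices:

```python
def prompt_cost(tokens: int, price_per_mtok: float,
                cached_fraction: float, cached_multiplier: float = 0.10) -> float:
    """Dollar cost of an input prompt when a fraction of its tokens
    hit the cache and are billed at ~10% of the uncached rate."""
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached + cached * cached_multiplier) * price_per_mtok / 1e6

# A 100k-token agent prompt at a hypothetical $3/MTok input price:
full = prompt_cost(100_000, 3.0, cached_fraction=0.0)   # $0.30 per call
warm = prompt_cost(100_000, 3.0, cached_fraction=0.9)   # $0.057 per call
print(f"savings: {1 - warm / full:.0%}")  # prints "savings: 81%"
```

For an agent making dozens of sequential calls over a mostly-stable prefix, the warm-cache rate is the one that dominates total spend.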

6. OpenAI Prompt Caching 201

OpenAIโ€™s cookbook covers advanced prompt caching tactics.

Cache hit requirements

  • Exact prefix match across requests.
  • At least 1,024 tokens before auto-caching can apply.
  • Matching is block-based (128-token increments until first mismatch).
  • Messages, images, audio, tools, and schemas all affect cache identity.
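The minimum-plus-blocks rule can be expressed directly (a sketch of the documented behavior, not OpenAI's implementation):

```python
def cached_prefix_tokens(matched_prefix: int,
                         min_tokens: int = 1024, block: int = 128) -> int:
    """Tokens servable from cache: zero below the 1,024-token minimum,
    otherwise whole 128-token blocks up to the first mismatch."""
    if matched_prefix < min_tokens:
        return 0
    return (matched_prefix // block) * block

print(cached_prefix_tokens(1000))  # 0: below the minimum, no caching
print(cached_prefix_tokens(1024))  # 1024: exactly eight full blocks
print(cached_prefix_tokens(1500))  # 1408: eleven full blocks; the last
                                   # 92 tokens are billed uncached
```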

Discount examples (as cited)

  • GPT-4o: 50% cached-token discount.
  • gpt-4.1: 75%.
  • gpt-5-nano: 90%.
  • gpt-realtime (audio): 98.75%.

Operational guidance

  1. Intentionally exceed 1,024 tokens when stable prefix reuse is expected.
  2. Place stable instructions/examples/tools first, variable user input last.
  3. Prefer allowed_tools over reshuffling tool arrays.
  4. Use prompt_cache_key for request routing by shared prefixes.
  5. Consider Responses API for higher cache utilization.
  6. Treat summarization/compaction as cache-breaking events and plan tradeoffs.
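Items 2 and 4 together look roughly like this. The payload mirrors the Responses API shape (`instructions`, `input`, `prompt_cache_key`), but the tenant scheme and prompt text are invented for illustration:

```python
# Stable instructions well past the 1,024-token threshold, reused verbatim.
STABLE_INSTRUCTIONS = "You are a support agent. " + "Follow the policy. " * 200

def make_request(user_input: str, tenant: str) -> dict:
    """Build a cache-friendly Responses API payload (illustrative)."""
    return {
        "model": "gpt-4.1",
        "instructions": STABLE_INSTRUCTIONS,      # stable prefix placed first
        "input": user_input,                      # variable content last
        "prompt_cache_key": f"support-{tenant}",  # route same-prefix traffic together
    }

a = make_request("Where is my order?", tenant="acme")
b = make_request("Cancel my plan", tenant="acme")
assert a["prompt_cache_key"] == b["prompt_cache_key"]
assert a["instructions"] == b["instructions"]  # byte-identical cacheable prefix
```

With the official SDK such a dict could be passed as `client.responses.create(**make_request(...))`; whether any given request actually hits the cache still depends on the exact-prefix matching rules above.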