Yoonchul Yi
โ† Back to daily insights

2026-02-20

/

📰 Daily Digest - 2026-02-20

6 items | Business, AI, DevTools


📋 Quick Summary

Jack Altman's 5 Lessons on Finding Product-Market Fit

Source: a16z speedrun (Substack) · Category: Business · Link: Original

  • Jack Altman shared five seed-stage lessons in an a16z fireside chat.
  • If PMF exists, traction appears quickly; repeated "just one more feature" requests often signal weak PMF.
  • Early hiring should focus on "diamonds in the rough" rather than obvious, fully priced talent.

Cursor's Self-Learning Coding Agents

Source: Cursor Blog · Category: AI · Link: Original

  • Cursor introduced a "Dynamic Context Discovery" pattern to reduce repeated agent mistakes.
  • Converting tool outputs into files and syncing MCP tools by folders reduced token use by 46.9%.
  • Agent skills are defined as files and discovered dynamically with grep and semantic search.

Lessons from Building Claude Code: Prompt Caching Is Everything

Source: Anthropic Engineering · Category: AI · Link: Original

  • Anthropic explains why prompt caching is core to production-grade agents.
  • Reusing prior round-trip computation dramatically lowers latency and cost.
  • Cache-friendly design requires ordering static before dynamic prompt parts and keeping model/tool sets stable.

Andrej Karpathy: The Future of Customized Software

Source: X (Twitter) · Category: AI · Link: Original

  • Karpathy reverse-engineered a treadmill API and built a personalized dashboard with an LLM agent in about one hour.
  • He argues that app-store style discrete app selection is becoming outdated.
  • Progress is constrained because most products still do not expose AI-native CLI/API interfaces.

Prompt Auto-Caching with Claude

Source: LangChain Blog · Category: DevTools · Link: Original

  • Claude API now supports automatic caching, pricing cached tokens at about 10% of uncached cost.
  • A single cache_control parameter replaces complex manual breakpoint management.
  • Combined with cache-friendly prompts, this can materially improve unit economics for agents.

OpenAI Prompt Caching 201

Source: OpenAI Cookbook · Category: DevTools · Link: Original

  • OpenAI describes strategies that can reduce time to first token (TTFT) by up to 80% and input cost by up to 90%.
  • Cache hits require exact prefix matching for at least 1,024 tokens, matched in 128-token blocks.
  • Responses API can achieve 40-80% better cache utilization than Chat Completions.

๐Ÿ“ Detailed Notes

1. Jack Altman's 5 Lessons on Finding Product-Market Fit

Jack Altman (Lattice co-founder, Alt Capital) shared practical PMF lessons for seed-stage teams.

Five lessons

  1. Hire early "diamonds in the rough"

    • Elite candidates often have many safer options and may not join very early startups.
    • Look for high-upside people the market has not fully priced in yet.
  2. Real PMF shows up quickly

    • Lattice's review product was their third pivot.
    • Paying customers appeared before the product was fully finished.
    • Endless "one more feature" requests usually indicate solution mismatch, not PMF.
  3. Listen to customers without losing your north star

    • Large near-term deals can pull the roadmap off strategy.
    • Out-of-scope asks often cascade into more misaligned asks.
  4. Operate proactively, not reactively

    • Many founders spend most of their time reacting to external requests.
    • Past ~20 employees, the CEO role shifts toward building the system that builds products.
  5. No universal startup advice exists

    • Advice optimized for the broadest audience often drops context.
    • Follow standard fundraising structure, but keep the timeline compressed and resist outside pressure.

2. Cursor's Self-Learning Coding Agents

Cursor's Jediah Katz introduced "Dynamic Context Discovery" to make coding agents more efficient and less error-prone.

Five implementation techniques

  1. Convert tool outputs into files

    • Save long shell/MCP outputs as files instead of truncating them in chat.
    • Agents can retrieve only needed slices (tail, targeted reads) without data loss.
  2. Use chat history as retrievable reference

    • When context windows require summarization, keep prior conversations as accessible files.
    • This helps recover details lost in compression.
  3. File-defined agent skill system

    • Skills are defined as files in an open format.
    • Agents dynamically discover relevant skills via grep and semantic search.
  4. Optimize MCP loading

    • Replace static inclusion of all tool descriptions with server-folder synchronization.
    • Reported total token usage drop: 46.9% on MCP calls.
  5. Sync terminal output to files

    • Mirror integrated terminal sessions to local files.
    • Enables queryable execution state without manual copy/paste.

Design principle

  • Treat files as the simplest scalable primitive for LLM tooling.
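Techniques 1 and 5 above reduce to a simple pattern: write the full output to disk, let the agent read back only the slice it needs. A minimal sketch, with hypothetical helper names (this is illustrative, not Cursor's implementation):

```python
import tempfile
from pathlib import Path

def save_tool_output(output: str, out_dir: Path, name: str) -> Path:
    """Persist a long tool/terminal output to a file instead of
    truncating it in the chat context (hypothetical helper)."""
    path = out_dir / f"{name}.log"
    path.write_text(output)
    return path

def read_tail(path: Path, n_lines: int = 20) -> str:
    """Retrieve only the slice the agent asked for, without data loss."""
    return "\n".join(path.read_text().splitlines()[-n_lines:])

# A 10,000-line build log is stored whole; only the requested
# slice ever enters the model context.
out_dir = Path(tempfile.mkdtemp())
log = "\n".join(f"line {i}" for i in range(10_000))
log_path = save_tool_output(log, out_dir, "build")
print(read_tail(log_path, 2))  # prints "line 9998\nline 9999"
```

The same file can later be grepped or re-read in full, which is what makes files a retrievable reference rather than a lossy summary.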

3. Lessons from Building Claude Code: Prompt Caching Is Everything

Anthropic describes prompt caching as the core production primitive for long-running agents.

Core mechanics

  • Reuse computation from prior round-trips to reduce both latency and cost.
  • Agent systems make many sequential calls, so stable system prompts/tools/history deliver large cache reuse.

Cache-friendly design rules

  1. Put static content first (system prompt, tool schema), dynamic user content later.
  2. Avoid model switching mid-session (switches invalidate cache).
  3. Keep tool sets stable in membership and order.
  4. Use cache-preserving forking/compaction patterns.
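A minimal sketch of rules 1-3, assuming a plain request-builder. The field names echo Anthropic's Messages API shape (`system`, `tools`, `cache_control`), but this only assembles a payload and is not Anthropic's code:

```python
SYSTEM_PROMPT = "You are a coding agent."  # stable across the session
TOOLS = [{"name": "read_file", "description": "Read a file from disk"}]

def build_request(history: list, user_msg: str) -> dict:
    """Assemble a cache-friendly request: static content first,
    dynamic user content last (illustrative sketch)."""
    return {
        "model": "claude-sonnet-4-5",  # rule 2: keep the model fixed mid-session
        # Rule 1: the stable prefix comes first and is marked cacheable.
        "system": [{"type": "text", "text": SYSTEM_PROMPT,
                    "cache_control": {"type": "ephemeral"}}],
        "tools": TOOLS,  # rule 3: same membership and order every turn
        "messages": history + [{"role": "user", "content": user_msg}],
    }

first = build_request([], "Fix the failing test")
second = build_request(first["messages"], "Now run it again")
# The static prefix is byte-identical across turns, so a prefix
# cache can serve it on every call after the first.
assert (first["system"], first["tools"]) == (second["system"], second["tools"])
```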

Production implication

  • This strategy is presented as one of Claude Code's key leverage points and reflected in the Claude Agent SDK.

4. Andrej Karpathy: The Future of Customized Software

Karpathy shares a concrete build log for highly customized AI-generated software.

Example build

  • Goal: 8-week resting heart-rate experiment (50 -> 45 BPM).
  • In about one hour, he used Claude to reverse-engineer Woodway treadmill APIs and build a dashboard.
  • He still needed to catch and correct unit/date bugs manually.
  • Total code size was around 300 lines; he estimates this used to take ~10 hours.

Two claims

  1. App-store model is aging out

    • Finding/installing niche apps for every tiny need becomes less natural.
    • Agents generating custom apps on demand becomes the default path.
  2. Need AI-native sensors/actuators

    • Devices should expose machine-usable interfaces (API/CLI), not human-only frontends.
    • Today, most products still expose HTML/CSS docs rather than agent-native interfaces.

5. Prompt Auto-Caching with Claude

LangChain highlights Claudeโ€™s new auto-caching behavior.

Key points

  • Cached token pricing is approximately 10% of uncached tokens.
  • Developers can rely on cache_control instead of manually managing breakpoints.
  • When paired with stable prompt structure and tool consistency, agent cost efficiency improves meaningfully.
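The cited ~10% figure implies roughly the following cost model. This is a back-of-envelope sketch; the $/MTok price and cache-hit fraction below are illustrative inputs, not quoted prices:

```python
def prompt_cost(tokens: int, price_per_mtok: float,
                cached_fraction: float, cached_multiplier: float = 0.10) -> float:
    """Dollar cost of an input prompt when a fraction of its tokens
    hit the cache and are billed at ~10% of the uncached rate."""
    cached = tokens * cached_fraction
    uncached = tokens - cached
    return (uncached + cached * cached_multiplier) * price_per_mtok / 1e6

# A 100k-token agent prompt at a hypothetical $3/MTok input price:
full = prompt_cost(100_000, 3.0, cached_fraction=0.0)   # $0.30 per call
warm = prompt_cost(100_000, 3.0, cached_fraction=0.9)   # $0.057 per call
print(f"savings: {1 - warm / full:.0%}")  # prints "savings: 81%"
```

For an agent making dozens of sequential calls over a mostly-stable prefix, the warm-cache rate is the one that dominates total spend.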

6. OpenAI Prompt Caching 201

OpenAIโ€™s cookbook covers advanced prompt caching tactics.

Cache hit requirements

  • Exact prefix match across requests.
  • At least 1,024 tokens before auto-caching can apply.
  • Matching is block-based (128-token increments until first mismatch).
  • Messages, images, audio, tools, and schemas all affect cache identity.
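The minimum-plus-blocks rule can be expressed directly (a sketch of the documented behavior, not OpenAI's implementation):

```python
def cached_prefix_tokens(matched_prefix: int,
                         min_tokens: int = 1024, block: int = 128) -> int:
    """Tokens servable from cache: zero below the 1,024-token minimum,
    otherwise whole 128-token blocks up to the first mismatch."""
    if matched_prefix < min_tokens:
        return 0
    return (matched_prefix // block) * block

print(cached_prefix_tokens(1000))  # 0: below the minimum, no caching
print(cached_prefix_tokens(1024))  # 1024: exactly eight full blocks
print(cached_prefix_tokens(1500))  # 1408: eleven full blocks; the last
                                   # 92 tokens are billed uncached
```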

Discount examples (as cited)

  • GPT-4o: 50% cached-token discount.
  • gpt-4.1: 75%.
  • gpt-5-nano: 90%.
  • gpt-realtime (audio): 98.75%.

Operational guidance

  1. Intentionally exceed 1,024 tokens when stable prefix reuse is expected.
  2. Place stable instructions/examples/tools first, variable user input last.
  3. Prefer allowed_tools over reshuffling tool arrays.
  4. Use prompt_cache_key for request routing by shared prefixes.
  5. Consider Responses API for higher cache utilization.
  6. Treat summarization/compaction as cache-breaking events and plan tradeoffs.
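Items 2 and 4 together look roughly like this. The payload mirrors the Responses API shape (`instructions`, `input`, `prompt_cache_key`), but the tenant scheme and prompt text are invented for illustration:

```python
# Stable instructions well past the 1,024-token threshold, reused verbatim.
STABLE_INSTRUCTIONS = "You are a support agent. " + "Follow the policy. " * 200

def make_request(user_input: str, tenant: str) -> dict:
    """Build a cache-friendly Responses API payload (illustrative)."""
    return {
        "model": "gpt-4.1",
        "instructions": STABLE_INSTRUCTIONS,      # stable prefix placed first
        "input": user_input,                      # variable content last
        "prompt_cache_key": f"support-{tenant}",  # route same-prefix traffic together
    }

a = make_request("Where is my order?", tenant="acme")
b = make_request("Cancel my plan", tenant="acme")
assert a["prompt_cache_key"] == b["prompt_cache_key"]
assert a["instructions"] == b["instructions"]  # byte-identical cacheable prefix
```

With the official SDK such a dict could be passed as `client.responses.create(**make_request(...))`; whether any given request actually hits the cache still depends on the exact-prefix matching rules above.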