Yoonchul Yi

2026-03-11


📰 Daily Digest: 2026-03-11

7 items | DevTools, Security, Business, Misc, Backend


📋 Quick Summary

Run prompts on a schedule

Source: Claude Code Docs · Category: DevTools · Link: Original

  • Claude Code introduces /loop for recurring prompts and natural-language one-time reminders in a live session.
  • Scheduling parses intervals with s/m/h/d units, defaults to every 10 minutes, and rounds intervals that cron cannot express exactly to a cron-compatible cadence.
  • The docs spell out the limits and runtime behavior: up to 50 tasks per session, low-priority execution between turns, jitter, and a 3-day expiry for recurring jobs.

Agent Safehouse

Source: agent-safehouse.dev · Category: Security · Link: Original

  • Agent Safehouse proposes a deny-first, kernel-enforced sandbox for local coding agents on macOS.
  • The tool ships as a single script and grants scoped access to the working directory while denying sensitive paths like ~/.ssh and ~/.aws.
  • It provides wrappers for multiple agent CLIs so teams can make sandboxed execution the default without changing daily workflows.

Company Graphs = Context Repository

Source: X (Heinrich) · Category: Business · Link: Original

  • The thread argues that AI quality in knowledge work is primarily a context-structure problem, not a model-capability problem.
  • It proposes a company-wide markdown knowledge graph that links decisions, strategy, meetings, research, and code into traversable context.
  • The key claim is that externalizing tacit knowledge from conversations into structured artifacts enables agents to reason across organizational memory.

How to use AI to generate (fantastic) slides

Source: X (Ruben Hassid) · Category: Misc · Link: Original

  • The post compares three deck workflows and rates direct Claude slide generation as fastest but visually weak for high-stakes presentations.
  • Gamma is positioned as a design-forward option with rapid generation, sharing analytics, and export support, but output quality depends heavily on prompt specificity.
  • The recommended path combines Claude for research/outline and Gamma for design, then adds human editing and reusable brand rules for consistent outcomes.

How Coding Agents Are Reshaping Engineering, Product and Design

Source: X (Harrison Chase) · Category: Business · Link: Original

  • The essay says the classic PRD → mock → implementation pipeline is collapsing as coding agents make prototyping cheap and fast.
  • It frames review quality as the new bottleneck, requiring stronger architecture judgment, product sense, and design evaluation across teams.
  • Role boundaries blur toward builder/reviewer archetypes, raising the value of generalists while increasing the specialization bar.

How we built LangChain's GTM Agent

Source: X (LangChain) · Category: Business · Link: Original

  • LangChain describes a human-in-the-loop GTM agent that automates lead research, personalized draft creation, and account-level intelligence.
  • Reported outcomes include +250% lead-to-qualified-opportunity conversion, 3x pipeline dollars, and 40 reclaimed rep hours per month per person.
  • The system design emphasizes multi-step orchestration, source-grounded explainability, and memory from rep edits to improve draft quality over time.

Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

Source: X (Hōrōshi バガボンド) · Category: Backend · Link: Original

  • The benchmark shows a 100-row primary-key lookup taking 0.09 ms in SQLite versus 1,815.43 ms in an LLM-generated Rust rewrite (~20,171x slower).
  • The analysis attributes major regressions to planner/rowid path mistakes and autocommit sync overhead, despite code that compiles and passes tests.
  • The broader point is that plausible architecture and passing tests can still hide catastrophic performance bugs without explicit acceptance criteria and benchmarking.

📝 Detailed Notes

1. Run prompts on a schedule

  • The new scheduling workflow centers on /loop, which accepts an interval plus prompt and maps it to cron-backed recurring execution inside a Claude Code session.
  • The interval can lead the prompt, trail it (every ...), or be omitted (default: 10 minutes); intervals given in seconds are rounded to cron's one-minute granularity.
  • Task management is exposed through CronCreate, CronList, and CronDelete, with an 8-character task ID and a 50-task cap per session.
  • Runtime behavior is clearly bounded: scheduler checks run every second, jobs execute between turns, recurring tasks expire after 3 days, and deterministic jitter prevents synchronized spikes.
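
The interval rules and the deterministic jitter described above can be sketched roughly as follows. This is an illustrative sketch, not Claude Code's implementation: the function names, the regular expression, the SHA-256 hash, and the 10% jitter fraction are all assumptions, and the rounding of odd minute intervals (the "cron-compatible cadence" step) is left out.

```python
import hashlib
import math
import re

DEFAULT_INTERVAL_S = 10 * 60  # the digest states a 10-minute default
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text=None):
    """Return an interval in seconds, rounded up to a whole minute.

    Hypothetical helper: accepts strings such as "45s", "5m", "2h", "1d".
    """
    if not text:
        return DEFAULT_INTERVAL_S
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    # cron cannot express sub-minute schedules, so round up to whole minutes
    return max(60, math.ceil(seconds / 60) * 60)


def jitter_seconds(task_id, interval_s, max_fraction=0.10):
    """Deterministic delay in [0, max_fraction * interval_s) derived from the task ID.

    max_fraction is an assumed parameter, chosen only for illustration.
    """
    digest = hashlib.sha256(task_id.encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return fraction * max_fraction * interval_s
```

Because the offset depends only on the task ID, the same task is always delayed by the same amount, while different tasks spread out instead of firing at the same instant.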

2. Agent Safehouse

  • Safehouse positions itself as a practical guardrail for --yolo-style agent usage by enforcing filesystem access boundaries at the kernel layer.
  • The default model grants read/write to the project workspace and blocks high-risk areas (SSH keys, cloud credentials, unrelated repos) unless explicitly allowed.
  • Installation is intentionally low-friction (single script, no build pipeline), making it easy to add wrappers for popular agent CLIs.
  • The product thesis is that probabilistic agent failure is inevitable at scale, so deterministic sandbox constraints should be the first line of defense.
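
The deny-first policy can be illustrated with a small userspace check. This sketch only models the decision logic; it is not how Safehouse works internally, which the source describes as kernel-enforced. The function name, its parameters, and the rule that a denied path always wins over an allowed root are assumptions made for the example.

```python
from pathlib import Path


def is_allowed(path, workdir, denied, extra_allowed=()):
    """Deny-first check (hypothetical): denied paths win, then only listed roots pass."""
    target = Path(path).expanduser().resolve()

    def under(root):
        # True when target is the root itself or lies somewhere below it
        root = Path(root).expanduser().resolve()
        return target == root or root in target.parents

    if any(under(d) for d in denied):
        return False
    # Anything outside the working directory and extra roots is denied by default
    return any(under(r) for r in (workdir, *extra_allowed))
```

With `denied=["~/.ssh", "~/.aws"]`, a file inside the project directory passes, while `~/.ssh/id_rsa` and a file in an unrelated repository are both rejected, the second simply because nothing granted it.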

3. Company Graphs = Context Repository

  • Heinrich's core argument is that coding improved first because software already existed as linked, traversable text artifacts (files/imports/modules).
  • He extends that pattern to non-engineering work: organizations need a graph of atomic markdown notes covering decisions, strategy, competitive research, and meeting-derived insights.
  • The thread emphasizes that critical context is usually fragmented across Slack, docs, and people's memory, which makes retrieval and reasoning brittle.
  • A maintained context graph is presented as the infrastructure that lets agents operate on institutional knowledge rather than disconnected snippets.
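
To make "traversable context" concrete, here is a minimal sketch of a note graph that an agent could walk. The thread does not specify a link syntax; the `[[wikilink]]` convention, the note titles, and both function names are assumptions for illustration only.

```python
import re
from collections import deque

# Assumed convention: notes reference each other as [[note-title]]
LINK = re.compile(r"\[\[([^\]|#]+)")


def build_graph(notes):
    """Map each note title to the known note titles it links to."""
    return {
        title: [t.strip() for t in LINK.findall(body) if t.strip() in notes]
        for title, body in notes.items()
    }


def reachable(graph, start, max_hops=2):
    """Breadth-first walk returning note titles within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    return list(seen)


# Hypothetical notes, kept in memory instead of read from markdown files
notes = {
    "decision-pricing": "Follows [[strategy-2026]] and [[meeting-0304]].",
    "strategy-2026": "Informed by [[research-competitors]].",
    "meeting-0304": "No outgoing links.",
    "research-competitors": "See [[code-billing]].",
    "code-billing": "",
}
graph = build_graph(notes)
context = reachable(graph, "decision-pricing", max_hops=2)
```

Starting from the pricing decision, a two-hop walk collects the strategy note, the meeting note, and the competitive research, which is the kind of linked context the thread argues agents need.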

4. How to use AI to generate (fantastic) slides

  • The post outlines three practical workflows: direct .pptx generation in Claude, direct generation in Gamma, and a combined Claude+Gamma pipeline.
  • Direct Claude is framed as speed-first but design-light, while Gamma is framed as design-strong but dependent on the quality of the input brief.
  • The preferred process separates concerns: Claude handles deep research plus structured outline, then Gamma handles visual generation from that outline.
  • A reusable brand operating model is suggested: import an existing brand template into Gamma, generate a markdown brand-rules file in Claude, and standardize team output around both assets.

5. How Coding Agents Are Reshaping Engineering, Product and Design

  • Harrison Chase argues that coding agents shift software creation from implementation scarcity to prototype abundance.
  • With more prototypes flowing in, engineering/product/design teams spend relatively more time on review, arbitration, and quality control.
  • He distinguishes "PRDs are dead" (true of the old waterfall process) from "requirements are dead" (false): documented intent remains essential for reviewers.
  • The essay predicts role blending around builders and reviewers, with product sense and systems thinking becoming cross-functional requirements.

6. How we built LangChain's GTM Agent

  • LangChain's GTM agent automates inbound/outbound prep by checking contact history, gathering CRM + call + web context, and proposing Slack drafts with rationale.
  • Reported business impact is substantial: +250% lead-to-qualified-opportunity conversion, 3x pipeline growth, and broad time savings across the sales org.
  • The implementation includes strict human approval gates, observability/evaluation hooks, and longitudinal learning from rep edits stored as structured memory.
  • Weekly account-intelligence runs expand scope beyond email drafts, surfacing risk/opportunity signals to help reps prioritize limited attention.

7. Your LLM Doesn't Write Correct Code. It Writes Plausible Code.

  • The case study benchmarks an LLM-generated Rust SQLite reimplementation and finds extreme slowdowns on basic operations, even though the code compiles and passes its tests.
  • A highlighted root cause is planner behavior that misses INTEGER PRIMARY KEY fast paths and falls back to full scans for common WHERE id = ? queries.
  • Another key factor is per-statement autocommit overhead (including frequent sync calls), amplified by repeated cloning/allocation/reload patterns.
  • The meta-lesson is methodological: correctness claims require performance benchmarks and explicit acceptance tests, not just compilable code and passing unit tests.
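
The fast path in question can be observed in SQLite itself. The sketch below uses Python's standard `sqlite3` module and `EXPLAIN QUERY PLAN` to compare a lookup on an INTEGER PRIMARY KEY with a lookup on an unindexed column. The table and column names are made up for the example, and the exact plan wording varies between SQLite versions. It demonstrates real SQLite behavior, not the Rust rewrite from the case study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Insert 100 rows inside one transaction rather than committing per statement
with conn:
    conn.executemany(
        "INSERT INTO items (id, name) VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(1, 101)],
    )


def plan(sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


# Lookup by INTEGER PRIMARY KEY: a direct rowid search
pk_plan = plan("SELECT name FROM items WHERE id = ?", (42,))

# Lookup by an unindexed column: a scan over the whole table
scan_plan = plan("SELECT id FROM items WHERE name = ?", ("item-42",))

print(pk_plan)
print(scan_plan)
```

A check like this could serve as one of the explicit acceptance criteria the article calls for: assert that the plan for `WHERE id = ?` is a primary-key search, and pair it with a timing benchmark on realistic row counts.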