# Spec-driven development: The AI engineering workflow at Notion | Ryan Nystrom

Podcast: How I AI
Published: May 11, 2026
Reading time: 19 min
Canonical: https://podbrew.app/briefs/how-i-ai-spec-driven-development-the-ai-engineering-workflow-at-notion-ryan-nyst

Ryan Nystrom, a software engineer at Notion and co-founder of Campsite, shares his insights on the evolving landscape of AI engineering. Ryan, a key builder of Notion AI and its Custom Agents feature, also leads Project Afterburner, a significant initiative to drastically cut Notion's CI time.

The discussion centers on spec-driven development, a transformative AI engineering workflow implemented at Notion. It covers creating custom AI agents, like those that auto-generate daily standup pre-reads from various data sources. Ryan explains Notion’s internal "Boxy" system, which integrates AI agents into development, and the critical role of swift CI in maximizing AI coding agent efficiency. The conversation also explores how to effectively prompt AI to defend its reasoning and why engineering managers benefit from staying hands-on with code.

This approach to AI engineering matters because it changes how software is built and managed. It shows how AI can streamline workflows, cut administrative work, and reduce burnout by freeing engineers for high-value tasks. By combining AI agents with fast iteration loops, teams can ship more while technical leaders stay hands-on and close to the code.

## Key takeaways

- Engineers using AI in their workflow report increased job satisfaction, working faster, and finding their roles more enjoyable.

- Notion's Project Afterburner aims to reduce CI time by 75% to accelerate development cycles.

- A custom Notion AI agent can synthesize context from Slack, Notion tasks, GitHub PRs, and past meeting transcripts to auto-generate detailed pre-reads for daily standups.

- Automating standup pre-reads eliminates manual preparation, allowing teams to focus meeting time on problem-solving, decision-making, and discussing next steps.

- Automated, data-rich pre-reads facilitate high-frequency meetings without high overhead, leading to more detailed and collaborative work, especially for remote teams.

- Detailed meeting pre-reads effectively surface critical updates and potential improvements, such as a 13% test improvement from a mock server fix, prompting focused discussions.

- Automating meeting prep and other administrative tasks significantly reduces burnout by freeing up time for hands-on, high-value work.

- Managers experience increased job satisfaction by shifting from tedious information compilation to creative problem-solving, team collaboration, and even coding.

- Notion's "Hot Potato" custom AI agent automates daily standup pre-reads by running at 9 AM and summarizing 24 hours of team activity.

- The agent uses sub-agents to gather data from multiple sources, including Honeycomb MCP metrics, Slack project channels, Notion task databases, and previous meeting transcripts.

- Automating minor, repetitive daily tasks, such as information gathering and reformatting, significantly reduces cognitive load by eliminating constant context shifting.

- Focusing on automating tedious 20-minute daily tasks can prevent burnout and free up mental energy for deeper work, rather than solely aiming for large time savings.

- AI agents can autonomously generate full pull requests, complete with UI verification, and resolve subsequent debugging and merge conflicts, all initiated by natural language prompts in collaboration platforms.

- A robust VM and background agent strategy is critical for large engineering organizations to effectively leverage AI, significantly enhancing developer velocity and streamlining engineering workflows.

- Notion AI shifted to a 'spec-first' development model to combat system prompt bloat and instruction fatigue for its AI agents.

- Engineers use Whisper to dictate feature ideas, which Codex then transforms into structured markdown specifications based on existing documentation patterns.

- A comprehensive, plain English spec allows AI agents like Codex to autonomously implement complex features in a highly efficient, often "one-shot," manner.

- A primary responsibility for engineers is to design and implement robust verification loops for AI agents to ensure their outputs are correct and reliable.

- For AI coding agents, CI speed is paramount; slow CI bottlenecks their tireless work, directly limiting an organization's capacity to ship code.

- Challenging AI outputs with phrases like "You're wrong. Defend your argument" can force the AI to provide robust evidence and pointed reasoning.

## 04:02 - 04:55 Ryan Nystrom Finds AI Energizing for Engineering and Team Management

Ryan Nystrom, an engineering manager who leads a team of six or seven people and also writes code, shares his perspective on the rapid changes brought by AI. He admits to feeling a bit overwhelmed by the pace of change but ultimately finds it incredibly energizing for his work.

Claire Vo notes that Nystrom's experience aligns with other guests on the show, who consistently report having more fun, working faster, and experiencing a complete transformation in their workflows due to AI. This widespread impact underscores a significant shift in the industry.

The changes aren't limited to coding tools or how individual code is written. Vo highlights that AI is fundamentally altering how teams are run, specifically mentioning Nystrom's use of Notion AI for team management. This indicates a broader application of AI in leadership and operational roles within engineering.

Nystrom's dual role as a manager and coder provides a unique perspective, demonstrating how AI impacts both the strategic oversight and the hands-on development aspects of engineering, making the entire process faster and more enjoyable for him.

> Kinda freaked out about like, everything's changing, the pace is up, but like, it's, it's really energizing for me.

## 04:55 - 06:37 Notion Launches Project Afterburner to Drastically Reduce CI Time

Ryan Nystrom introduced Project Afterburner, an internal Notion initiative designed to significantly cut Continuous Integration (CI) times. Nystrom observed that Notion's CI was "in between" and "slower than we need to be" compared to his experiences at other companies.

Nystrom was tapped to lead this effort because of his outspoken views on developer experience (DevX) and CI, and because of his team's reputation for being "very fast and very AI pilled." The project set an aggressive target: to reduce Notion's CI time to just one-quarter of its current duration.

Project Afterburner is managed through a central Notion hub. This hub houses comprehensive documentation, databases, meeting records, and includes automation that tracks incremental successes, such as seconds saved from individual CI jobs.
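The target and the "seconds saved" tracking reduce to simple arithmetic, sketched below. This is only an illustration: the job names, the baseline, and the savings are made-up placeholders, and the model assumes every saved second comes off the end-to-end CI time (which is only true for jobs on the critical path).

```python
from dataclasses import dataclass


@dataclass
class JobSaving:
    job: str              # hypothetical CI job name
    seconds_saved: float  # time shaved off that job


def progress_toward_target(baseline_s, savings, target_fraction=0.25):
    """Return (current CI seconds, fraction of the required reduction achieved)."""
    target_s = baseline_s * target_fraction  # goal: one quarter of the baseline
    needed = baseline_s - target_s           # total seconds that must be removed
    saved = sum(s.seconds_saved for s in savings)
    return baseline_s - saved, min(saved / needed, 1.0)


# Placeholder numbers, not Notion's real CI figures.
savings = [
    JobSaving("jest-unit", 240),
    JobSaving("typecheck", 120),
    JobSaving("lint", 90),
]
current, progress = progress_toward_target(baseline_s=2400, savings=savings)
print(f"{current:.0f}s per run, {progress:.0%} of the way to the target")
```

With a 40-minute baseline, reaching one quarter means removing 30 minutes, so 450 saved seconds is a quarter of the required reduction.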

> We had this really aggressive goal to cut our CI into like a quarter of what it is.

## 06:37 - 10:01 Automating Daily Standups with a Notion AI Custom Agent

Notion leverages a custom AI agent to transform daily standups from tedious status updates into productive discussions. The agent automatically generates a detailed pre-read for each meeting, eliminating the need for manual preparation. This allows the team to focus on critical problems, decisions, and next steps during the standup.

The custom AI agent gathers context from various sources. It scans Slack conversations from the last 24 hours, closed tasks in Notion, merged GitHub pull requests, and even yesterday's meeting transcript. It then compiles this information into a comprehensive pre-read that includes metrics, recent decisions, project progress, identified bugs, feedback, and open questions.
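The gathering step amounts to a time-window filter followed by grouping. The sketch below assumes each source has already been exported into a common record shape; the `Item` type, the helper names, and the sample records are illustrative and not part of Notion's agent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Item:
    source: str          # e.g. "slack", "notion", "github", "transcript"
    text: str
    timestamp: datetime


def recent_items(items, now, window=timedelta(hours=24)):
    """Keep only items whose timestamp falls inside the look-back window."""
    cutoff = now - window
    return [item for item in items if cutoff <= item.timestamp <= now]


def group_by_source(items):
    """Group item texts by source, oldest first within each source."""
    grouped = {}
    for item in sorted(items, key=lambda i: i.timestamp):
        grouped.setdefault(item.source, []).append(item.text)
    return grouped


# Placeholder data for illustration only; not real team activity.
now = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)
items = [
    Item("slack", "Question about flaky test retries", now - timedelta(hours=3)),
    Item("github", "Merged PR: fix mock server in Jest tests", now - timedelta(hours=20)),
    Item("notion", "Closed task from last week", now - timedelta(days=6)),
]
context = group_by_source(recent_items(items, now))
print(context)
```

The week-old Notion task falls outside the 24-hour window and is dropped, which is what keeps the pre-read focused on what changed since the last standup.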

This automated system means team members can work right up until the start of the meeting without having to prepare individual updates. When everyone joins the video call, the agenda and relevant context are already laid out on screen, enabling a high-bandwidth discussion. This approach addresses the common issue of unproductive standups, ensuring that meetings are a valuable use of everyone's time by shifting focus from 'what I did' to 'what we need to discuss and decide'.

> I can basically work up until like the minute of our meeting. Without having done a bunch of like prep, and then we all get on a video call and we look at the screen and we're like, 'Okay, here's what we need to talk about,' and we'll like hit each bullet.

## 10:01 - 12:01 High-Quality, High-Frequency Automated Meetings Enhance Collaboration

Unproductive meetings where attendees have "glazed-over eyes" and are not paying attention are a significant problem, often leading to the common sentiment that "this could have been an email." Such meetings fail their core purpose: to exchange ideas and information, which can be dangerous for project success.

A more effective approach involves high-frequency meetings enabled by automated, detailed pre-reads. These pre-reads act as conversation starters, reducing overhead while allowing for deeper, more collaborative work, particularly beneficial for remote teams where not everyone is in the same room.

For example, reviewing these detailed pre-reads might reveal that someone fixed a mock server environment in Jest tests, leading to a 13% test improvement. Such discoveries, which might otherwise be missed, prompt further discussion and exploration into additional optimizations.

This method also democratizes information sharing. It provides a platform for brilliant but quieter engineers to share their insights, ensuring their valuable contributions are heard alongside those who might naturally dominate a 30-minute discussion.

> I've been in way too many meetings where I can tell everybody's eyes are glazed over, nobody's paying attention.

## 12:01 - 15:39 Automating Meeting Prep Reduces Burnout and Increases Job Satisfaction

AI can significantly streamline information gathering, leveling the playing field for team members regardless of their communication styles or comfort levels. One host shares how using AI as a communication proxy helps manage social anxiety, ensuring all necessary information is collected efficiently without personal communication barriers.

Automating routine tasks, particularly meeting preparation, directly combats burnout by eliminating the constant cycle of prepping, waiting, and attending meetings. This allows engineers and managers to engage in productive, hands-on work right up until a meeting starts, preventing the common stress of feeling perpetually behind.

Managers, in particular, benefit from this shift, moving away from time-consuming information compilation and report writing. This creates more opportunities for creative problem-solving, direct team support, and even contributing to code, which is often more enjoyable and valuable than administrative tasks.

By removing tedious "paperwork," the workflow becomes more engaging and fun. This allows managers to focus on higher-value activities like collaborating with people, solving hard problems, and building, ultimately leading to greater job satisfaction and a more productive work environment.

> If I'm not in a meeting, I'm prepping waiting for a meeting and then I'm in a meeting.

## 15:39 - 16:20 Engineering Leaders Should Maintain Hands-On Coding Skills

Ryan Nystrom advocates for engineering leaders, from managers to CTOs, to remain hands-on with coding. He believes it is easy and beneficial for them to contribute by fixing bugs or making optimizations.

The current environment is characterized as the "era of the hard skill," where practical coding abilities are paramount. This includes writing code, developing automations, learning new tools, and understanding various AI models.

This perspective shifts focus away from solely developing soft skills or stakeholder management. Instead, it emphasizes the importance of direct technical contribution and continuous learning to stay relevant and effective as a leader.

> This is the era of the hard skill. This is not, 'How do I get better at my soft skills and managing stakeholder?' This is literally like, 'How do you write code? How do you write automations? How do you learn these new tools? How do you understand what models do, do what for your own skills?'

## 16:32 - 20:01 Configuring Notion's Hot Potato AI Agent for Daily Standup Automation

Ryan details the "Hot Potato" custom AI agent within Notion, which is engineered to automate daily standup pre-reads. This agent is set to run every day at 9 AM, focusing specifically on activity that occurred within the last 24 hours. Its main job is to collect relevant information and distill it into a short, engaging summary for the team.

The Hot Potato agent utilizes Notion AI's sub-agent capability, even though it can be expensive and sometimes finicky. It fans out to gather data from several distinct sources, including pulling the latest metrics from Honeycomb MCP, scanning the project's Slack channel for updates, feedback, and questions, and extracting relevant tasks from the team's Notion task database. The agent also incorporates key insights from yesterday's meeting transcripts.

Once the data is collected, the agent formats the information using a specific template that covers CI speed, decisions, progress, changes, bugs, questions, and risks. The resulting output is a concise pre-read, delivered as a link in Slack, intended to be brief and fun, often quirky or even a bit corny. The agent has viewing access to most critical databases but is granted specific editing permissions only for the meetings database, where it updates the page.
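The fan-out and template steps can be approximated in plain Python. This is a rough sketch, not how Notion AI sub-agents are implemented: the fetcher functions are hypothetical stand-ins that return canned placeholder bullets, and only the section names come from the template described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Section order follows the template described in the episode.
SECTIONS = ["CI speed", "Decisions", "Progress", "Changes", "Bugs", "Questions", "Risks"]


# Hypothetical stand-ins for sub-agents; real ones would query each source.
def fetch_metrics():
    return {"CI speed": ["(placeholder) latest CI duration from the metrics source"]}


def fetch_slack():
    return {"Questions": ["(placeholder) open question from the project channel"]}


def fetch_tasks():
    return {"Progress": ["(placeholder) task closed in the task database"]}


def fan_out(fetchers):
    """Run every fetcher concurrently and collect results in submission order."""
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = [pool.submit(fetch) for fetch in fetchers]
        return [future.result() for future in futures]


def render(results):
    """Merge per-source results into the fixed section template."""
    merged = {section: [] for section in SECTIONS}
    for result in results:
        for section, bullets in result.items():
            merged[section].extend(bullets)
    lines = []
    for section in SECTIONS:
        bullets = merged[section] or ["Nothing new"]
        lines.append(f"## {section}")
        lines.extend(f"- {bullet}" for bullet in bullets)
    return "\n".join(lines)


pre_read = render(fan_out([fetch_metrics, fetch_slack, fetch_tasks]))
print(pre_read)
```

Keeping the template fixed means an empty section still appears in the pre-read, so the team can see at a glance that nothing was reported rather than wondering whether a source was skipped.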

> I am explicitly telling it to use sub-agents, which is kind of a sleeper feature in Notion AI. It's very expensive, and it can be kind of finicky sometimes.

## 20:01 - 24:02 Automating Small Tasks to Protect Cognitive Load and Boost Productivity

Many engineers spend considerable time on tedious, repetitive tasks like gathering information from sources such as Slack, Honeycomb, and GitHub, then reformatting it for different audiences. This constant context switching, even for tasks that take only 20 minutes a day, adds significant cognitive load and can feel 'soul-sucking.'

AI automation can eliminate this drudgery. Users can describe their desired workflow in natural language, even using screenshots, and the AI can quickly iterate and adapt. The focus isn't necessarily on saving several hours, but rather on removing frequent, minor interruptions that collectively impact mental energy.

By automating these small yet frequent tasks, engineers can reduce their cognitive load and focus on higher-value, deeper work without constant mental shifts. This approach reduces burnout and produces a 'win-win-win': work becomes more relaxing, more fun, and more productive, unlike the typical 'pick two' dilemma.

> It's weird to have like this like win-win-win. You know, they do the triangle, and they're like, 'Pick two,' and you're like, 'No, I'm gonna pick all three.'

## 25:50 - 27:25 Notion Introduces Boxy, An Internal AI Coding and Agent System

Notion has developed an internal system called Boxy, also known as Software Factory, which integrates AI coding and agent capabilities directly into its engineering workflows. This system addresses the common developer challenge of structuring and managing prompts for AI tools by allowing engineers to write detailed prompts within Notion pages.

Boxy operates using virtual machines, each equipped with coding agents such as OpenAI's Codex and Claude Code. These agents can be invoked directly from tasks created within Notion. This setup allows engineers to use AI for code generation and development assistance without leaving their work environment.

For example, when a user requests a new feature, such as the ability to copy a link to a specific tab block, an engineer can create a task in Notion, add notes and screenshots, and then use Boxy to generate the necessary code or automate parts of the development process. This integration streamlines the development cycle, making it easier to implement new features directly from user feedback.

> We built this thing that, we're kind of like calling it, I think we're calling it Software Factory, but, like, its internal project name is Boxy, 'cause it's like all these little VMs that we install Codex and Claude Code on.

## 27:35 - 32:02 Boxy AI Agent Generates and Defends Automated Pull Request

Ryan demonstrated Boxy, Notion's internal AI agent system, automating code changes. He tasked Boxy with fixing a UI issue, including adding a copy link button and correcting a hover state, simply by mentioning Codex in a Notion comment.

Within approximately ten minutes, Boxy generated a complete pull request, providing a preview URL and crucial UI verification screenshots. When a CI failure occurred, Boxy not only explained its reasoning for the initial change but also resolved the type check issues and handled a merge conflict.

This interaction highlights a new paradigm for code review, where developers can be candid, stating "I don't get it" without social friction, prompting the AI to provide detailed explanations. This shift underscores the importance of a background agent strategy for integrating AI into engineering, improving velocity and developer experience by offloading environment setup and local machine tasks.

> I literally don't know what I'm doing here. You need to explain this to me.

## 32:10 - 34:03 Notion AI Adopts Spec-First Development for Agents

Notion AI faced challenges with tool and instruction fatigue, resulting in bloated system prompts for their AI agents. To address this, they adopted a 'spec-first' development approach, drawing inspiration from the concept of skills and progressive disclosure used in coding agents. This new methodology aims to streamline the development of AI features.

The workflow involves engineers starting with a markdown document to define new AI agent behaviors. For example, when building an 'Ask Mode' feature that restricts the AI to only reading and answering questions, an engineer would begin by orally dictating the feature's requirements into Whisper.

This audio recording is then fed to Codex, which, by learning from Notion's existing spec library, formats the information into a structured markdown specification. After some revisions by the engineer, this markdown document is checked into a dedicated 'agent specs' subfolder within their codebase. This method prioritizes clear documentation and design before any code is written.

> Let's not start with code, like, let's just start with specs.

## 34:03 - 36:27 Comprehensive Specs Act as the Living Source of Truth and Changelog for AI-Driven Development

A comprehensive plain English specification serves as the blueprint for AI agents to autonomously build features. For example, by pointing Codex at such a spec file, it can "one-shot" the implementation of a Notion AI feature, generating thousands of lines of code in just a few hours.

The detailed spec includes code pointers and verification instructions, allowing for automated testing and validation using CLI tools. After initial code review, the AI-generated implementation is often largely complete and functional.
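One lightweight way to keep specs complete is to lint them before handing them to an agent. The sketch below checks a markdown spec for a set of required headings. The section names and the sample spec are assumptions made for illustration; the episode does not describe the exact layout of Notion's spec files.

```python
import re

# Hypothetical required headings; adjust to whatever the spec library uses.
REQUIRED_SECTIONS = ("Behavior", "Code pointers", "Verification")


def missing_sections(spec_markdown, required=REQUIRED_SECTIONS):
    """Return the required headings that do not appear in the spec."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", spec_markdown, flags=re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]


# Illustrative spec text, not an actual Notion spec.
draft_spec = """\
# Ask Mode

## Behavior
The agent only reads and answers questions. It never edits content.

## Code pointers
(paths omitted in this example)
"""

print(missing_sections(draft_spec))
```

A check like this could run in CI on the spec folder, so a spec without verification instructions is flagged before an agent ever tries to implement it.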

This spec, managed under version control, becomes the definitive source of truth and a complete changelog for how a specific part of Notion AI functions. Any future updates are made by modifying the spec, not the code directly, reinforcing its central role.

Beyond engineering, the plain English nature of the spec makes it a valuable asset for other business units, such as marketing, who can use it to understand and communicate about new features in a way that raw code cannot.

> The spec is the source of truth, the spec as the, the, the change log, I think, is a really interesting model.

## 36:27 - 38:03 Engineering Roles Are Evolving Towards Systems Thinking, Architecture, and Defining AI Agent Verification

Modern technical specifications are detailed and include code, but not all of it. This hybrid model allows experienced engineers to focus on architecture and design while AI agents handle more coding tasks.

The role of engineers is shifting from primarily writing code to becoming systems thinkers and architects. This involves not only designing behaviors and writing specifications but also, crucially, defining robust verification loops.

A key responsibility for engineers working with AI agents is to determine how these agents should verify the correctness of their work. If the verification process is unclear, that's the first problem to address.

A practical step involves building tools like a command-line interface (CLI) to allow an AI agent, such as Codex, to run prompts and observe outcomes. This enables engineers to use detailed specs to guide the agent without manually handling plumbing work.
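A verification CLI of this kind can be very small. The sketch below runs a list of prompt cases against an agent and reports failures as JSON, returning a non-zero exit code so a coding agent can tell when its change broke a behavior. Everything here is hypothetical: `run_agent` is a canned stand-in for the system under test, and the case format and flag name are invented for the example.

```python
import argparse
import json

# Built-in example cases, used when no --cases file is given.
DEFAULT_CASES = [
    {"prompt": "Please edit this page title", "expect": "can't edit"},
    {"prompt": "What is on this page?", "expect": "found"},
]


def run_agent(prompt):
    """Hypothetical stand-in for the agent under test (canned replies)."""
    if "edit" in prompt.lower():
        return "I can't edit pages in Ask Mode, but I can answer questions."
    return "Here is what I found."


def verify(cases):
    """Run each case and return a list of failures (empty means all passed)."""
    failures = []
    for case in cases:
        output = run_agent(case["prompt"])
        if case["expect"].lower() not in output.lower():
            failures.append(
                {"prompt": case["prompt"], "expected": case["expect"], "got": output}
            )
    return failures


def main(argv):
    parser = argparse.ArgumentParser(description="Check agent behavior against cases.")
    parser.add_argument("--cases", help="path to a JSON file with prompt/expect pairs")
    args = parser.parse_args(argv)
    cases = DEFAULT_CASES
    if args.cases:
        with open(args.cases) as handle:
            cases = json.load(handle)
    failures = verify(cases)
    print(json.dumps({"passed": len(cases) - len(failures), "failed": failures}, indent=2))
    return 1 if failures else 0  # non-zero exit code signals failure to the caller


exit_code = main([])  # demo run with the built-in cases
```

To use this as a real script, the last line would become `sys.exit(main(sys.argv[1:]))` under an `if __name__ == "__main__":` guard. Substring checks are deliberately crude; the point is that the agent gets a fast, machine-readable pass or fail signal it can iterate against.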

> I view our job as like engineers evolving into like systems thinkers and architects, and not, and not even just necessarily writing like the spec and thinking about the behaviors, but most importantly is like the verification loop, like, is it a-- like, how should it verify correctness?

## 38:03 - 40:00 AI Eliminates Development Bottlenecks by Prioritizing Live Code Over Document Debates

Traditional software development often involves extensive time spent on design documents and spec debates during meetings, which delays the actual coding process. This traditional model adds significant overhead before any code is even written.

AI-driven development fundamentally shifts this paradigm by enabling immediate implementation and verification of code. Instead of waiting for reviews and meetings, teams can ship code and engage in a rapid verification loop.

This change means that debates occur based on the merits of live, working software rather than theoretical discussions from documents. Human attention is redirected from pre-coding planning to evaluating and refining functional applications.

Practical applications of this shift include leveraging AI to prepare for meetings by integrating with tools like Slack and GitHub for better stand-ups, using AI as background agents for tasks such as PRs, and employing AI to autonomously generate code based on repository specifications as the source of truth, then updating specs rather than code.

> No more waiting for the meeting, no more waiting for review, ship it. Have a verification loop. Debate it on the merits of it being live and working versus the theoretical merits of it sitting in a document waiting for everybody's calendar to open up for a live argument.

## 40:03 - 43:00 Ryan Nystrom Explains His Preference for Codex

Ryan Nystrom chose Codex over Claude Code because, in his experience, it maintains context better during long tasks. While Claude Code tended to lose focus once its context window was full, Codex could reliably operate for hours without issue.

His workflow benefits from tools that support 'one-shotting solutions,' allowing him to initiate multiple agent tasks simultaneously and manage them in a round-robin fashion. This approach frees him to attend meetings or work on other tasks, rather than being tethered to a single iterative process.

Nystrom also values Codex for its simplicity, lack of unnecessary features, and its compatibility with GPT-5.4. These characteristics align well with his personal working style and the specific types of development work he undertakes.

Claire Vo adds that Codex excels at long-running projects and its project concept is helpful for managing multiple distinct initiatives. She also highlights its strong capabilities as a 'tireless' and 'uncomplaining' code and security reviewer.

> The more, the closer I can get to one-shotting solutions, the better, 'cause that frees me up to do other stuff.

## 43:10 - 46:04 CI Speed is Critical for AI Coding Agents and Human Developers

Rapid Continuous Integration (CI) is essential for effective human development workflows, allowing engineers to quickly push changes, gather feedback, and iterate. A fast CI loop, enabling daily or even more frequent deployments, reduces the risk of large, complex changes and allows for continuous learning and improvement based on live usage.

The importance of fast CI is dramatically amplified with the introduction of AI coding agents. Unlike humans, agents work tirelessly, around the clock. If a CI pipeline takes an hour, an AI agent will spend that hour idly waiting for results, severely bottlenecking its potential output. Conversely, a CI run that completes in three minutes allows an agent swarm to process significantly more work.

Organizations running AI agents at scale, such as Stripe, which is cited in the episode as processing 1,300 agent PRs per week, cannot afford slow CI. A sluggish CI pipeline imposes a direct mathematical limit on an organization's capacity to ship code to production, preventing it from realizing the full benefits of AI-driven development.
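The "mathematical limit" can be made concrete with a back-of-the-envelope model. The sketch below assumes each agent works in a strict loop (make a change, then wait for CI before continuing) and runs around the clock. The 10-minute work time per change is an illustrative assumption; the 60-minute and 3-minute CI durations are the two cases contrasted above.

```python
MINUTES_PER_DAY = 24 * 60


def iterations_per_day(work_minutes, ci_minutes, agents=1):
    """Upper bound on completed change-plus-CI cycles per day.

    Assumes each agent blocks on CI and runs cycles back to back.
    """
    cycle = work_minutes + ci_minutes
    return agents * (MINUTES_PER_DAY // cycle)


slow = iterations_per_day(work_minutes=10, ci_minutes=60)
fast = iterations_per_day(work_minutes=10, ci_minutes=3)
print(f"60-minute CI: {slow} cycles per agent per day")
print(f"3-minute CI:  {fast} cycles per agent per day")
print(f"Swarm of 10 agents: {iterations_per_day(10, 60, agents=10)} vs "
      f"{iterations_per_day(10, 3, agents=10)} cycles per day")
```

Under these assumptions the same agent completes 20 cycles a day with hour-long CI and 110 with three-minute CI. Adding agents multiplies both numbers but does not change the ratio, which is why CI speed, not agent count, sets the ceiling.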

Engineering leaders must prioritize investments in CI pipeline speed. This not only maximizes the efficiency and output of AI agents but also significantly improves the developer experience for human engineers, who no longer have to endure frustrating waits for their changes to deploy. Fast CI is a win-win for both human and AI productivity.

> there is just a true mathematical limit on your capacity to ship code to production that is a reflection of how fast your CI pipeline is.

## 46:04 - 46:49 Prompting AI to Defend Its Reasoning

The speaker challenges AI output directly, pushing back on its initial suggestions with prompts such as "You're wrong. Defend your argument." The goal is to make the AI provide evidence and solid reasoning for its recommendations.

This method is particularly effective when the user isn't an expert in the task at hand, such as configuring CI/CD pipelines. In these cases, the specifics can be highly nuanced, and ensuring correctness is paramount.

By compelling the AI to defend its arguments, the user can check the reliability of its responses and get well-supported justifications rather than simple agreement. The speaker describes this approach as very helpful for getting complex tasks right.

> If I push counter to what it has done, that it can like back up with like good, pointed reasons.
