Martin Pecha

Oct 12, 2025

5 min

OpenAI’s AgentKit makes building agents easy, but enterprise value depends on production: deep integrations, reliability, governance, and scale to turn pilots into real ROI.

TL;DR: Visual builders like AgentKit lower the barrier to build, not to operate. Enterprise value shows up when automations survive system changes, scale across teams, and pass audits. Visual canvases still demand developer skills, cover the easy connectors, and crumble as tasks grow longer and cross more systems—great for demos, frail in enterprise. Winners flip the model: let business owners describe workflows in natural language, keep IT in the approval loop, orchestrate SAP/Salesforce/email/no-API internal systems, and ship with monitoring, audit trails, and self-healing. Do that and you get production-grade reliability, team-scale reuse, and fast ROI.

Where Visual Builders Stop and Production Begins

Last week, at its Dev Day event, OpenAI launched AgentKit, a toolkit featuring a "visual canvas for composing logic with drag-and-drop nodes" to build AI agents (OpenAI, 2025). The AI community responded in minutes, some predicting it could wipe out leading startups, while others pointed to a familiar gap between slick demos and end-to-end, auditable automation in real enterprise environments.

Ethan Mollick, Wharton professor and one of TIME Magazine's Most Influential People in Artificial Intelligence, put it bluntly on LinkedIn: "I really didn't think command line & node-based interfaces would be the future of user experiences. And I hope they don't continue to be, seems like a real failure of imagination compared to what is possible."

His critique exposes a paradox at the heart of enterprise automation: We're making it easier to create agents while ignoring the complexity of running them in production.

Visual builders feel like progress. Drag-and-drop interfaces promise accessibility. But for enterprise operations teams facing real-world automation challenges—supplier onboarding across SAP and Salesforce, multi-system data integration, cross-functional workflow orchestration—the canvas isn't the bottleneck.

Production is.

The Accessibility Illusion: Who Is AgentKit Actually For?

OpenAI's AgentKit announcement reveals the gap between "democratizing AI" marketing and operational reality.

What AgentKit Actually Provides:

Visual canvas with drag-and-drop nodes for workflow logic
Pre-built connectors to Dropbox, Google Drive, SharePoint, Microsoft Teams
Evaluation tools for measuring agent performance
Embeddable chat interface
Available in beta, with usage billed under standard API pricing

What It Requires:

Developer mindset to design node-based workflows
Technical knowledge to configure guardrails
API integration skills for custom connections
Understanding of agent architecture and evaluation frameworks

The tool positions itself as accessible ("no-code"), but the reality is more nuanced. Industry observers note that AgentKit remains a visual workflow builder for developers, not a solution for non-technical business users.

The gap is significant: while AgentKit expands who can create agents, it still requires technical expertise and remains too complex for team-scale collaboration where business users share workflows and use cases across departments.

The core issue: Still too technical for enterprise operations teams.

The Pattern We've Seen Before

This isn't new territory for enterprise automation.

Traditional RPA vendors—UiPath, Automation Anywhere, Blue Prism—promoted visual builders a decade ago. The promise: Business users could build automations themselves. The reality: IT teams still implemented everything, projects took 6-12 months, and automations broke every time SAP or Salesforce updated their interfaces.

Even n8n, the workflow automation platform that recently raised a Series C, positions itself with visual node-based workflows. While their open-source approach has gained traction, the fundamental challenge remains: visual builders democratize creation but don't solve production complexity.

Now AI agent platforms are repeating the same pattern: prettier interfaces, same fundamental challenges.

The Three Gaps Visual Builders Don't Address

Enterprise deployments reveal three critical gaps where visual builder implementations consistently fail—regardless of how accessible the canvas looks.

Gap 1: Integration Hell

The Problem:

Pre-built connectors handle a minority of real-world automation scenarios. AgentKit includes connections to Dropbox, Google Drive, SharePoint, and Microsoft Teams (OpenAI, 2025)—useful for document workflows, but nowhere near comprehensive for enterprise operations.

The Reality:

Operations teams need automations that span SAP ERP, Salesforce CRM, supplier portals, email, Excel, and internal databases
Healthcare organizations require HIPAA-compliant data filtering that generic connectors don't provide
Financial services face regulatory reporting requirements across multiple systems that visual builders don't handle
Manufacturing operations need integration with MES, ERP, and quality management systems
Procurement teams face vendor-specific APIs with authentication requirements visual builders don't address

Community comparisons of no-code workflow platforms point to a consistent split: general-purpose workflow tools (Zapier/Make) still offer the broadest long-tail app coverage, while node-based builders target deeper internal flows rather than cross-app automation (Latenode Community, 2025).

Users hit the wall quickly: "Questions remain about how far visual builders can go and where coding becomes necessary."

Example: Supplier Onboarding Automation

A procurement manager at a mid-market enterprise needs to automate supplier onboarding:

Extract data from vendor emails (various formats)
Validate against compliance requirements (GDPR, internal policies)
Create records in SAP vendor master data
Update Salesforce opportunity records
Trigger approval workflows in the internal system
Send confirmation emails with next steps

AgentKit's pre-built connectors don't cover this. Building custom integrations requires:

API authentication for multiple systems
Error handling for malformed vendor data
Retry logic for network failures
Data transformation between system formats
Compliance validation rules

The visual canvas makes the workflow visible. It doesn't make the integration challenges disappear.
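To make two items on that list concrete (error handling for malformed vendor data and transformation between system formats), here is a minimal Python sketch. The field names, the target keys, and the validation rules are hypothetical and chosen only for illustration; they are not SAP, Salesforce, or AgentKit schemas.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = ("name", "vat_id", "email", "country")  # hypothetical schema


@dataclass
class ValidationResult:
    ok: bool
    errors: list[str]


def validate_vendor(raw: dict) -> ValidationResult:
    """Collect every problem instead of stopping at the first one."""
    errors = []
    for field_name in REQUIRED_FIELDS:
        value = raw.get(field_name)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"missing or empty field: {field_name}")
    email = raw.get("email")
    if isinstance(email, str) and email.strip() and "@" not in email:
        errors.append("email has no '@'")
    return ValidationResult(ok=not errors, errors=errors)


def to_vendor_master(raw: dict) -> dict:
    """Map a validated record to a hypothetical target format."""
    result = validate_vendor(raw)
    if not result.ok:
        # Reject malformed input before it reaches downstream systems.
        raise ValueError("; ".join(result.errors))
    return {
        "vendor_name": raw["name"].strip(),
        "tax_number": raw["vat_id"].replace(" ", "").upper(),
        "contact_email": raw["email"].strip().lower(),
        "country_code": raw["country"].strip().upper(),
    }
```

Even this toy version has to decide what counts as malformed and what the target system expects, which is exactly the work a canvas does not remove.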

Gap 2: Production Reliability

The Problem:

Demos work with clean data and happy paths. Production environments have edge cases, concurrent users, system failures, and data quality issues that visual builders don't address.

The Statistics:

Industry data reveals the scale of this challenge. BT Group, running AI agents in production for customer service, reports success rates "nearing 50% across several key client journeys", per a vendor roundup (Master of Code, 2025).

That's a coin flip.

For comparison, mature organizations target deployment success rates above 99.5%. The gap between demo-ready and production-ready is measured in dozens of percentage points of reliability.

What Visual Builders Miss:

Edge case handling (what happens when data arrives in different languages or is missing required fields?)
Concurrent execution (can multiple teams run the same automation simultaneously?)
Graceful degradation (if external systems are down, queue the records for later instead of failing)
Data validation (reject malformed input before it corrupts downstream systems)
Audit trails (compliance teams need to see every automation decision across all operations)
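The graceful-degradation item on this list can be sketched in a few lines of Python. This is an in-memory illustration only: `DeferredQueue` and the `send` callable are hypothetical names, and a production system would presumably use a durable queue rather than a Python `deque`.

```python
from collections import deque
from typing import Callable


class DeferredQueue:
    """Hold records in memory while a downstream system is unavailable."""

    def __init__(self, send: Callable[[dict], None]):
        self._send = send  # hypothetical delivery function
        self._pending = deque()

    def submit(self, record: dict) -> bool:
        """Try to deliver now; on a connection error, keep the record for later."""
        try:
            self._send(record)
            return True
        except ConnectionError:
            self._pending.append(record)
            return False

    def flush(self) -> int:
        """Retry queued records in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._pending)
```

The design choice here is that an outage turns into a backlog rather than a failed run, so nothing is lost and order is preserved when the system comes back.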

Research confirms the pattern: a 2025 study formalized the “50% task-completion time horizon”, the length of task (measured by how long it typically takes a human) at which a model succeeds about 50% of the time. Results show near-100% success on tasks that take humans under ~4 minutes, dropping to ~10% for tasks over four hours, with the 50% horizon doubling roughly every seven months since 2019. Follow-on work models this as an approximately exponential decline in success with task length.

Complex, multi-step enterprise workflows aren't “short tasks”. They're multi-hour, multi-system orchestrations with failure modes at every step.
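To see what an approximately exponential decline implies, here is a small Python sketch of the simplest such model: success probability halves for every additional horizon's worth of task length. The 60-minute horizon is a hypothetical value chosen for illustration, not a measured one, and this simplified form is not necessarily the fitting procedure used in the study.

```python
def success_probability(task_minutes: float, horizon_minutes: float) -> float:
    """Simplified model: success halves for each extra `horizon_minutes` of task length."""
    if task_minutes < 0 or horizon_minutes <= 0:
        raise ValueError("task length must be >= 0 and horizon must be > 0")
    return 0.5 ** (task_minutes / horizon_minutes)


HORIZON = 60.0  # hypothetical 50% horizon in minutes (assumption, not data)

for minutes in (4, 60, 240):
    p = success_probability(minutes, HORIZON)
    print(f"{minutes:>4} min task: {p:.0%}")
```

Under this assumed horizon the model gives roughly 95% for a 4-minute task, 50% at 60 minutes, and about 6% at four hours. That is the same shape as the reported figures, though it is not fitted to them.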

The Maintenance Burden:

OpenAI highlights that Ramp (a fintech company) "built a procurement agent in a few hours instead of months" (TechCrunch, 2025).

But what about maintaining it?

When SAP releases a UI update? When vendor email formats change? When compliance requirements shift? Visual builders make initial creation faster—they don't eliminate ongoing maintenance complexity.

Gap 3: Domain Expertise Translation

The Challenge:

Enterprise operations require domain-specific knowledge that AI agents can't provide and visual builders can't encode.

Examples:

Healthcare automation needs clinical decision-making protocols and regulatory compliance knowledge
Financial services require understanding of reconciliation rules, regulatory reporting, and risk management frameworks
Manufacturing operations depend on production scheduling logic, quality control standards, and supply chain coordination
Retail category management needs trade promotion expertise specific to channels and product categories

A category manager knows how trade promotions work in their specific retail channels better than any AI agent or visual builder template ever will.

The traditional automation model assumes IT translates business requirements into technical workflows. But IT teams don't have the process expertise. This creates a translation gap:

Category manager explains promotion tracking process to IT
IT interprets requirements (introducing translation errors)
IT builds automation (taking 6-12 months due to backlog)
Business process has changed by the time automation is ready
Iteration cycle begins again

Visual builders don't solve this. They just move where the translation happens—from IT writing code to business users configuring nodes.

The Fundamental Question:

Should category managers learn node-based workflow design? Or should automation platforms learn to understand business processes described in natural language?

Why Developer-First Tools Miss the Enterprise Mark

The focus on "Dev Day" announcements reveals the gap between what gets built and what enterprises actually need.

AI labs optimize for developer productivity. But the actual source of automation innovation is non-technical domain experts—the operations professionals, process owners, and functional leads who understand which processes need automation and why.

The Three-Part Problem:

Creation Complexity: Visual builders still require a technical mindset (node-based thinking, API concepts, workflow logic)
Single-Player Focus: Individual developers build isolated agents without team collaboration or knowledge sharing at scale
Production Blind Spot: Tools optimize for demo creation, not production reliability and governance

Enterprise automation requires the opposite approach:

Natural language creation: Business users describe workflows in their own terms, not technical diagrams
Team-scale deployment: Shared automations across operations, finance, supply chain, procurement, HR teams
Production-first design: Built for reliability, governance, and maintenance from day one

This is where business-first automation platforms like Duvo fundamentally differ from visual builders.

The Business-First Alternative

What actually works for enterprise automation? A fundamentally different architecture.

Natural Language → Working Automation (No Canvas Required)

Instead of teaching operations professionals to think in nodes and workflows, let them describe processes the way they already explain them to colleagues:

Example: Multi-Source Data Consolidation (Retail)

A category manager says:

"Every Monday morning, I need to pull last week's promotional performance data from each of our top 5 retail partners. The data comes in different formats—some send Excel files via email, others have FTP sites, one uses an API. I combine it all into a standardized report, compare actual performance against forecasted volume, flag any promotions that underperformed by more than 15%, and share the analysis with my sales director and the marketing team."

That's the workflow. No nodes. No canvas. Just the process described in business terms—whether it's retail promotion tracking, financial reconciliation, or manufacturing quality reporting.
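One step of that description, flagging promotions that underperformed forecast by more than 15%, is simple enough to sketch in Python. The `PromoResult` record and its field names are hypothetical; only the 15% threshold comes from the category manager's description. Collecting the data from email, FTP, and an API is the harder part and is not shown.

```python
from dataclasses import dataclass

UNDERPERFORMANCE_THRESHOLD = 0.15  # "more than 15%" from the description


@dataclass
class PromoResult:
    partner: str
    promotion: str
    forecast_units: float
    actual_units: float

    @property
    def shortfall(self) -> float:
        """Fraction by which actual volume fell short of forecast."""
        return (self.forecast_units - self.actual_units) / self.forecast_units


def flag_underperformers(results, threshold=UNDERPERFORMANCE_THRESHOLD):
    """Return promotions whose shortfall exceeds the threshold.

    Rows with a zero or negative forecast are skipped, since no
    shortfall ratio can be computed for them.
    """
    return [r for r in results if r.forecast_units > 0 and r.shortfall > threshold]
```

A promotion forecast at 1,000 units that sold 800 has a 20% shortfall and is flagged; one that sold 900 is not.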

The Platform's Job:

Understand the natural language description
Identify the systems involved (enterprise applications, email, APIs, databases)
Build the cross-system automation
Route to IT for approval and governance review
Deploy to production with monitoring and error handling

The Result:

The operations professional gets the automation they need without learning visual workflow design. IT maintains governance and oversight without implementing every workflow manually.

Production-Ready by Design

Enterprise automation platforms need to assume production complexity, not treat it as an afterthought.

Self-Healing When Systems Change:

Unlike traditional RPA (which breaks when SAP updates a screen) or visual builders (which require manual workflow updates), business-first automation adapts when systems change.

How It Works:

Semantic understanding of business intent (not screen-scraping or brittle API calls)
API-first integration where available (fallback to UI automation only when necessary)
Continuous learning when system interfaces change
Automatic workflow adjustments with human review for significant changes

Enterprise Error Handling:

Production automation faces failures:

External systems are down
Email attachments are corrupted
API rate limits exceeded
Data formats changed unexpectedly
Network connectivity issues

Visual builders expose these errors to users ("node 47 failed"). Business-first platforms handle them:

Retry logic with exponential backoff
Queue for later processing when external systems are unavailable
Alert operations teams only when a human decision is needed
Audit trail of all automated decisions for compliance
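The first item, retry logic with exponential backoff, looks roughly like this in Python. This is a generic sketch of the technique, not Duvo's or any vendor's implementation; the function name, the default delays, and the choice of which exceptions count as transient are all assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on transient errors, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise  # out of retries: surface the error for escalation
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Only the final failure is raised, which matches the idea of alerting people only when a human decision is needed.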

The Reliability Gap:

Remember the BT Group statistic: success rates nearing 50% across several key client journeys for AI agents in production.

Enterprise operations can't tolerate 50%. Whether it's supplier onboarding, financial reconciliation, quality control workflows, or demand forecasting—these are business-critical processes.

Production-ready automation requires:

95%+ success rate in normal operations
Graceful degradation during system outages
Clear escalation paths when automation can't proceed
Full audit trails for compliance and troubleshooting

Business User Expertise + IT Governance

The correct automation model isn't:

IT builds everything (traditional approach—too slow)
Business users build everything (AgentKit/visual builder dream—unrealistic)

It's a collaboration model:

Business users create automations (they know the process)
IT approves and governs (they ensure security, compliance, architecture)
Platform handles production complexity (self-healing, error handling, monitoring)

The Approval Workflow:

An operations professional builds a cross-system data consolidation automation (whether it's trade promotion tracking, financial reconciliation, or quality reporting). Before it runs in production:

IT reviews systems being accessed
Compliance reviews data handling
Security reviews authentication methods
Architecture reviews error handling and monitoring

This takes hours or days—not the 6-12 months of traditional IT implementation.
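The approval gate described above can be modeled as a simple rule: nothing deploys until all four reviews are recorded and approved. The Python sketch below is illustrative only; the class name, role labels, and methods are hypothetical and do not describe any specific platform's API.

```python
from dataclasses import dataclass, field

# The four review roles named in the list above.
REQUIRED_REVIEWS = ("it", "compliance", "security", "architecture")


@dataclass
class AutomationRequest:
    name: str
    approvals: dict = field(default_factory=dict)  # reviewer role -> approved?

    def record_review(self, role: str, approved: bool) -> None:
        """Store one reviewer's decision; unknown roles are rejected."""
        if role not in REQUIRED_REVIEWS:
            raise ValueError(f"unknown reviewer role: {role}")
        self.approvals[role] = approved

    def pending_reviews(self) -> list:
        """Roles that have not reviewed the request yet."""
        return [r for r in REQUIRED_REVIEWS if r not in self.approvals]

    def can_deploy(self) -> bool:
        """Deployable only when every required review exists and approved."""
        return all(self.approvals.get(r) is True for r in REQUIRED_REVIEWS)
```

A single rejection or a missing review keeps `can_deploy()` false, which is the control IT retains in this model.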

The Governance Advantage:

IT doesn't lose control. They gain visibility:

Full audit trail of every automation
Monitoring dashboards for performance and errors
Approval authority before production deployment
Ability to pause or modify any automation

This collaboration model—business users create, IT governs, platform self-heals—is what separates production-ready automation from visual builder demos.

The Economic Reality: Why Visual Builders Cost More Than Expected

The true cost of visual builder automation emerges after deployment. Let’s look at the illustrative model below.

Visual Builder Path:

Initial investment: €150K total
- Platform license: €50K annually
- Integration development: €80K (4 months × €20K/month developer time)
- Training and setup: €20K
Timeline: 9 months (build, debug, production hardening)
Success rate: ~50-60% (based on production AI agent data)
Maintenance: €30K annually for ongoing debugging when systems change

Business-First Automation Path:

Initial investment: €150K total
- Platform license: €100K annually
- Implementation support: €50K (approximately 2 man-days of forward-deployed engineer time per use case, across 5 initial use cases)
Timeline: ~2 days to first automation (2 man-days of engineer support)
Success rate: 95%+ (production-ready architecture, self-healing)
Maintenance: Included in platform (self-healing architecture eliminates ongoing debugging costs)
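The cost lines in the two lists above can be added up over several years with a short Python sketch. It uses only the figures stated in the lists, plus one assumption that the source does not spell out: the visual builder's €30K maintenance cost starts in year two, on top of the recurring license. Cost is only one side of the model; the timeline and success-rate lines are not captured here.

```python
def cumulative_cost(year_one: int, recurring: int, years: int) -> int:
    """Total spend in thousands of euros after `years` years."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return year_one + recurring * (years - 1)


# Figures in thousands of euros, taken from the two lists above.
# Assumption: maintenance is charged from year two onward.
VISUAL_BUILDER = {"year_one": 50 + 80 + 20, "recurring": 50 + 30}
BUSINESS_FIRST = {"year_one": 100 + 50, "recurring": 100}

for years in (1, 2, 3):
    vb = cumulative_cost(years=years, **VISUAL_BUILDER)
    bf = cumulative_cost(years=years, **BUSINESS_FIRST)
    print(f"year {years}: visual builder {vb}K, business-first {bf}K")
```

On these inputs both paths cost €150K in year one, and spend alone stays close over three years (€310K versus €350K), so the comparison turns on time-to-value and success rate rather than on license cost.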

The Payback Difference:

Same initial investment. Dramatically different time-to-value and reliability.

For a mid-market enterprise automating supplier onboarding, financial reconciliation, quality reporting, and cross-system data consolidation with Duvo:

250+ hours saved weekly across operations teams
€75K-150K annual ROI per automated use case
4-6 month payback period (not 18-24 months debugging visual workflows)
Zero maintenance overhead from UI changes (self-healing architecture)

What Enterprises Actually Need

The AgentKit launch and the community response reveal what's still missing in enterprise automation:

1. Natural Language Automation (Not Visual Canvases)

Business users shouldn't learn workflow design. Platforms should understand business process descriptions.

2. Production-First Architecture (Not Demo-Optimized Tools)

Self-healing when systems change. Enterprise error handling. Audit trails and compliance. Built-in, not bolted-on.

3. Business User Creation + IT Governance (Not Single-Player Developer Tools)

Scale automation across teams while maintaining enterprise controls.

4. Domain Expertise Encoding (Not Generic Agent Templates)

Operations professionals across finance, supply chain, procurement, HR, and quality management have process knowledge that shouldn't require translation to technical workflows. Platforms like Duvo capture this expertise through natural language descriptions, not node configurations.

5. Team Collaboration & Sharing (Not Individual Agent Building)

Enterprise automation requires team-scale deployment where operations teams share workflows and build institutional knowledge. Visual builders optimize for individual developers; business-first platforms enable team-wide automation capabilities.

The Real Innovation Isn't Visual—It's Architectural

The breakthrough in enterprise automation isn't prettier visual interfaces. It's the complete rethinking of the automation stack:

The Old Model (RPA, Visual Builders, AgentKit):

User (developer or business user learning visual tools) designs workflow
Platform executes workflow as designed
User maintains and debugs when things break

The New Model (Duvo's Business-First Automation):

Business user describes process in natural language
Platform builds production-ready automation with IT approval
Platform self-heals when systems change, escalates only when needed

The Difference:

Duvo doesn't make visual builders prettier. It eliminates the need for visual builders by handling the complexity that visual builders expose. Operations professionals describe what they need in business terms. Duvo handles the cross-system orchestration, error handling, and production reliability automatically.

What This Means for Enterprise Operations

If you're a COO, VP Operations, or IT Director evaluating automation platforms, the AgentKit launch offers a clarifying moment.

Ask These Questions:

1. Who will actually create the automations?

If the answer is "developers using a visual canvas," you're back in the IT bottleneck. Operations professionals across finance, supply chain, procurement, and HR teams won't adopt node-based thinking.

2. What happens when our systems update?

If the answer is "you'll need to update the workflows," you're signing up for perpetual maintenance. Your ERP, CRM, and external vendor portals all change regularly.

3. How do we maintain enterprise governance?

If the answer is "business users build whatever they want," your compliance and security teams will veto the platform. If it's "IT implements everything," you haven't solved the bottleneck.

4. What's the success rate in production?

If the answer is vague or references "demo environments," you're looking at months of debugging. Production reliability isn't a feature—it's the foundation.

5. How do we scale across teams?

If the answer is "individual developers build agents," you're not building institutional automation capabilities. You're creating scattered individual projects.

The Right Platform (What Duvo Provides):

Business users describe processes in natural language—no node-based thinking required
IT approves and governs before production—full visibility and control
Platform handles cross-system complexity automatically—SAP, Salesforce, email, Excel, supplier portals
Self-healing when systems change—no maintenance burden when UIs update
95%+ success rate in production—enterprise-grade reliability, not demo performance
Shared workflows scale across teams—operations, finance, supply chain, procurement, HR teams collaborate

Duvo's architecture addresses every gap visual builders expose. The result: automation that actually works in production, maintained by the platform instead of your team.

Conclusion: Beyond the Hype, Into Production

OpenAI's AgentKit launch represents genuine progress in making AI agent creation more accessible. For many workflows, a visual canvas is faster than writing code by hand. Pre-built connectors save development time. Evaluation tools help measure performance.

But accessibility in creation doesn't eliminate complexity in production.

The industry response exposes the gap between what AI labs are building and what enterprises actually need. Visual builders feel like innovation. But for enterprise operations teams facing real-world automation challenges—supplier onboarding across SAP and Salesforce, financial reconciliation from multiple systems, quality reporting with cross-functional data—the canvas isn't the constraint.

Production is.

The real automation breakthrough won't come from prettier interfaces. It will come from platforms that encode business expertise, handle production complexity automatically, and empower non-technical domain experts without eliminating IT governance.

That's not a visual builder problem. It's an architectural problem.

Duvo solves it by moving beyond demos and into production-ready, team-scale, business-first automation. Not by making creation easier. By making production reliable.

For enterprise operations teams across industries, that's the difference between automation that works in a demo and automation that runs your business.

Sources:

OpenAI launch post (Oct 6, 2025): official details on AgentKit, Agent Builder, Connector Registry, ChatKit, and Evals; availability and pricing. (OpenAI)
TechCrunch (Oct 6, 2025): developer focus and ChatKit summary. (TechCrunch)
VentureBeat (Oct 6, 2025): connectors and registry summary. (VentureBeat)
ChatKit docs: embeddable chat UI. (OpenAI Platform)
Mollick post: original quote on LinkedIn; supporting post on X. (LinkedIn, X)
Ramp example: OpenAI’s case blurb (“blank canvas… in hours”). (OpenAI)
BT Group ‘~50%’ stat: vendor roundup (treat as secondary). (Master of Code Global)
Task-duration limits: arXiv study (50% time-horizon metric) and popular summary. (arXiv)
n8n Series C: company blog and independent coverage. (n8n Blog)

Stop waiting.
Start automating.

Join the 500+ enterprises already transforming their operations with DUVO.
Get your personalized automation roadmap in 15 minutes.

Book a demo
