OpenAI has announced the release of Operator, an artificial intelligence (AI) agent capable of navigating the web to perform tasks on behalf of users.
The feature, described as a “research preview,” is initially available to subscribers of OpenAI’s ChatGPT Pro tier, priced at $200 per month, and will launch first in the US, according to a company blog post.
Operator is powered by a “Computer-Using Agent” model, which combines the vision capabilities of GPT-4o with advanced reasoning developed through reinforcement learning.
This allows it to interact with graphical user interfaces (GUIs) in a browser by typing, clicking, and scrolling.
“Operator can ‘see’ (through screenshots) and ‘interact’ (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations,” OpenAI explained.
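The screenshot-then-act cycle OpenAI describes can be pictured as a simple loop. The sketch below is illustrative only and is not OpenAI's implementation: `FakeBrowser`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names, and the "model" is a fixed script rather than anything that actually reads pixels.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str                      # "click", "type", "scroll", or "done"
    payload: dict = field(default_factory=dict)


class FakeBrowser:
    """Hypothetical stand-in for a browser: records actions instead of performing them."""

    def __init__(self):
        self.log = []

    def screenshot(self) -> bytes:
        # A real agent would capture pixels; here we return a placeholder frame.
        return f"frame-{len(self.log)}".encode()

    def perform(self, action: Action) -> None:
        self.log.append(action)


def scripted_policy(screenshot: bytes, step: int) -> Action:
    """Placeholder for the model: replays a fixed script and ignores the screenshot."""
    script = [
        Action("click", {"x": 120, "y": 48}),
        Action("type", {"text": "table for two"}),
        Action("scroll", {"dy": 400}),
    ]
    return script[step] if step < len(script) else Action("done")


def run_agent(browser, policy, max_steps=10):
    """Observe the screen, choose one action, perform it, and repeat until done."""
    for step in range(max_steps):
        shot = browser.screenshot()      # "see" through a screenshot
        action = policy(shot, step)      # decide the next mouse/keyboard action
        if action.kind == "done":
            break
        browser.perform(action)          # "interact" with the page
    return browser.log


if __name__ == "__main__":
    for action in run_agent(FakeBrowser(), scripted_policy):
        print(action.kind, action.payload)
```

Because the loop only depends on a screenshot going in and a mouse or keyboard action coming out, nothing in it is specific to any one website, which is the point of avoiding custom API integrations.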
The AI agent also employs reasoning to self-correct errors and seeks user input when it encounters sensitive tasks, such as entering login credentials or approving an action like sending an email. OpenAI emphasized that Operator has been designed to refuse harmful requests and block disallowed content.
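One way to picture the hand-off for sensitive steps is a gate that pauses before certain action types and proceeds only if the user approves. This is a minimal sketch under stated assumptions, not a description of how Operator works internally: the action names in `SENSITIVE_KINDS` and the function `execute_with_confirmation` are hypothetical, chosen to mirror the two examples OpenAI gives.

```python
# Hypothetical labels for the sensitive actions mentioned in the announcement.
SENSITIVE_KINDS = {"enter_credentials", "send_email"}


def execute_with_confirmation(kind, perform, confirm):
    """Run `perform(kind)` unless the action is sensitive and the user declines.

    `perform` and `confirm` are caller-supplied callables (assumptions for this
    sketch): `confirm(kind)` returns True if the user approves the action.
    """
    if kind in SENSITIVE_KINDS and not confirm(kind):
        return "declined"
    perform(kind)
    return "performed"


if __name__ == "__main__":
    done = []
    print(execute_with_confirmation("click", done.append, lambda k: False))
    print(execute_with_confirmation("send_email", done.append, lambda k: False))
    print(done)
```

Routine actions such as clicking pass straight through, while a declined email is never sent.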
OpenAI is collaborating with companies such as DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, and Uber to ensure Operator meets practical, real-world needs while adhering to established norms.
However, the tool is not without limitations. OpenAI warned that Operator may struggle with complex interfaces, such as those used for creating slideshows or managing calendars.
OpenAI plans to extend Operator to users of its Plus, Team, and Enterprise tiers and eventually to integrate the agent’s capabilities into ChatGPT.