The AI Operating System

An AI Agent that uses your computer, including the browser, Excel, and PowerPoint, to do tasks


Gist: An AI Agent that uses your computer, including the browser, Excel, and PowerPoint, to do tasks.

Who: Shanghai AI Lab plus others

What did they do:

  • Built Friday, an agent that combines Python code with GPT-4 prompts

  • Friday controls a Linux or macOS computer

  • It operates applications such as the browser, Excel, and PowerPoint to perform tasks

  • It self-improves

How did they do it?

  • Created a set of sequential prompts and code, grouped into components such as:

    • Planner - decomposes user requests into smaller tasks

    • Configurator - middleware that takes each task and configures it with data from memory or how-tos from the tool repository before passing it to the Executor

    • Declarative memory - stores the user profile and the history of previous actions

    • Tool repository - holds the tools available to the agent

    • Working memory - keeps the next steps for tasks and the previous history

    • Executor - generates an executable command

    • Critic - assesses whether a task has been completed successfully or whether another iteration is needed

  • GPT-4 was the underlying AI model
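The flow between the components listed above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: every name here (`Memory`, `plan`, `configure`, `execute`, `critique`, `run`, `fake_llm`) is hypothetical, the prompt strings are invented, and the GPT-4 call is replaced by a stub so the example runs offline. The sketch only returns the generated commands and never runs them.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a GPT-4 call: takes a prompt, returns a reply.
LLM = Callable[[str], str]


@dataclass
class Memory:
    declarative: dict = field(default_factory=dict)  # user profile, past actions
    tools: dict = field(default_factory=dict)        # tool name -> how-to text
    working: list = field(default_factory=list)      # log of (task, command) steps


def plan(llm: LLM, request: str) -> list[str]:
    """Planner: ask the model to split a request into one task per line."""
    reply = llm(f"PLAN: {request}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def configure(task: str, memory: Memory) -> dict:
    """Configurator: attach the profile and any how-tos whose tool name appears in the task."""
    hints = [howto for name, howto in memory.tools.items() if name in task.lower()]
    return {"task": task, "profile": memory.declarative, "hints": hints}


def execute(llm: LLM, configured: dict) -> str:
    """Executor: ask the model for a command (returned as text, not run here)."""
    return llm(f"EXECUTE: {configured['task']} | hints={configured['hints']}")


def critique(llm: LLM, task: str, command: str) -> bool:
    """Critic: treat a reply starting with 'ok' as success."""
    return llm(f"CRITIQUE: {task} -> {command}").strip().lower().startswith("ok")


def run(llm: LLM, request: str, memory: Memory, max_retries: int = 2) -> list[str]:
    """Plan, then loop configure -> execute -> critique for each task."""
    accepted = []
    for task in plan(llm, request):
        for _ in range(max_retries + 1):
            command = execute(llm, configure(task, memory))
            memory.working.append((task, command))
            if critique(llm, task, command):
                accepted.append(command)
                break
    return accepted


def fake_llm(prompt: str) -> str:
    """Offline stub with canned replies, used in place of a real model."""
    if prompt.startswith("PLAN:"):
        return "open excel\nsum column A"
    if prompt.startswith("EXECUTE:"):
        task = prompt[len("EXECUTE: "):].split(" |")[0]
        return f"echo {task!r}"
    return "ok"


if __name__ == "__main__":
    memory = Memory(tools={"excel": "use a spreadsheet library"})
    for command in run(fake_llm, "Total column A in my spreadsheet", memory):
        print(command)
```

Swapping `fake_llm` for a function that calls a real model would leave the control flow unchanged. The retry limit is an assumption added here so that a Critic that never approves cannot loop forever.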

Generating Python code to set dark mode on an app

What did they find?

  • Friday (their agent framework) outperformed GPT-4 with Plugins on GAIA, a benchmark for general agents

  • It could perform tasks in both Excel and PowerPoint

Comparison of FRIDAY agent on the GAIA agent benchmark

What are the implications?

  • This is a working demonstration of Andrej Karpathy’s proposal for an AI Operating System

  • Ideas along these lines have been circulating for a while now

  • These systems will get better
