internal
AI

Page Canary Learning Browser

Hypothesis

  • AI tools can learn how to use a website
  • Learning works by using AI models to detect and click on page elements
  • If a workflow succeeds, we save its set of actions for reuse

Premise

  • Web automation is usually a two-step workflow:
  1. Get the query selectors for the target elements
  2. Click and type to accomplish the workflow
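
The two steps above can be sketched as recorded data plus a replay loop. This is only an illustration under stated assumptions: `Step`, `FakePage`, and `replay` are hypothetical names, the selectors are made up, and `FakePage` stands in for a real browser driver so the example runs without one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step:
    """One recorded action: what to do, on which element, with what text."""
    action: str      # "click" or "type"
    selector: str    # query selector for the target element
    text: str = ""   # only used by "type"


@dataclass
class FakePage:
    """Stand-in for a browser page (assumption); it only logs the calls."""
    log: list = field(default_factory=list)

    def click(self, selector: str) -> None:
        self.log.append(("click", selector))

    def type_text(self, selector: str, text: str) -> None:
        self.log.append(("type", selector, text))


def replay(page, steps) -> None:
    """Run each recorded step against the page, in order."""
    for step in steps:
        if step.action == "click":
            page.click(step.selector)
        elif step.action == "type":
            page.type_text(step.selector, step.text)
        else:
            raise ValueError(f"unknown action: {step.action}")


# Hypothetical sign-in workflow; selectors and credentials are placeholders.
sign_in = [
    Step("type", "#username", "alice"),
    Step("type", "#password", "secret"),
    Step("click", "button[type=submit]"),
]
page = FakePage()
replay(page, sign_in)
```

Because a workflow is plain data here, the same list of steps could be stored after a successful run and replayed later without asking a model again.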

Pivot n34t.com

"We're building an AI powered web browser"

  • Write an integration test for the sign-in workflow on "example.com"

Example

Given the task "Message Reddit user 'n34t-bot' hello"

Break it down into:

  1. Navigate to Reddit
  2. Log in to Reddit
  3. Message user
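
One possible way to hold this breakdown in code is an ordered plan that tracks which substep comes next. `Plan`, `Substep`, and `next_pending` are hypothetical names chosen for this sketch; the note does not specify a data model.

```python
from dataclasses import dataclass, field


@dataclass
class Substep:
    description: str
    done: bool = False


@dataclass
class Plan:
    directive: str                                   # the user's original task
    substeps: list[Substep] = field(default_factory=list)

    def next_pending(self):
        """Return the first substep that is not done, or None when finished."""
        for substep in self.substeps:
            if not substep.done:
                return substep
        return None


plan = Plan(
    directive="Message Reddit user 'n34t-bot' hello",
    substeps=[
        Substep("Navigate to Reddit"),
        Substep("Log in to Reddit"),
        Substep("Message user"),
    ],
)

# Mark the first substep as finished; the plan then points at the second one.
plan.next_pending().done = True
current = plan.next_pending()
```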

System design

UI => Query screening => Task breakdown => Knowledge lookup => Run tasks => Save results in knowledge DB => Return result
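
The flow above could be wired together roughly as follows. This is a sketch, not the actual design: `run_pipeline` and its stage parameters are hypothetical, each stage is passed in as a plain function, and stopping at the first failed substep is an assumption made here because later substeps usually depend on earlier ones.

```python
from typing import Callable, Optional


def run_pipeline(
    directive: str,
    screen: Callable[[str], Optional[str]],      # returns a rejection reason or None
    plan: Callable[[str], list[str]],            # returns ordered substeps
    lookup: Callable[[str], Optional[list]],     # returns known actions or None
    run: Callable[[str, Optional[list]], bool],  # returns True on success
    save: Callable[[str, list], None],           # persists the run's results
) -> dict:
    """Call the stages in the order shown in the flow line."""
    rejection = screen(directive)                # query screening
    if rejection is not None:
        return {"ok": False, "reason": rejection, "steps": []}

    results = []
    for step in plan(directive):                 # task breakdown
        known = lookup(step)                     # knowledge lookup
        succeeded = run(step, known)             # run task
        results.append((step, succeeded))
        if not succeeded:
            break                                # assumption: stop on first failure

    save(directive, results)                     # save results in knowledge DB
    ok = bool(results) and all(succeeded for _, succeeded in results)
    return {"ok": ok, "reason": None, "steps": results}


# Stub stages, for illustration only.
saved = []
result = run_pipeline(
    "Message Reddit user n34t-bot hello",
    screen=lambda directive: None,
    plan=lambda directive: ["Navigate to Reddit", "Log in to Reddit", "Message user"],
    lookup=lambda step: None,
    run=lambda step, known: True,
    save=lambda directive, results: saved.append((directive, len(results))),
)
```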

UI

  • Collect the directive from the user
  • View previous runs

Query screening

Query screening tries to fail early when the user enters an invalid query. It checks:

  • Does the bot have enough information to perform the task?
  • Is the task achievable by a web browser?
  • Is the task malicious, illegal, etc.?
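
A minimal sketch of this fail-early behavior: run the checks in order and return the reason for the first one that fails. `screen_query` is a hypothetical name, and the three lambda checks are toy keyword heuristics standing in for whatever the real system uses (for example, a call to an AI model).

```python
from typing import Callable, Optional

Check = Callable[[str], bool]


def screen_query(query: str, checks: list[tuple[str, Check]]) -> Optional[str]:
    """Return the reason for the first failed check, or None if all pass."""
    for reason, passes in checks:
        if not passes(query):
            return reason
    return None


# Toy stand-ins for the three questions above (assumptions, not real logic).
checks = [
    ("not enough information", lambda q: len(q.split()) >= 3),
    ("not achievable in a web browser", lambda q: "lawn" not in q.lower()),
    ("malicious or illegal", lambda q: "steal" not in q.lower()),
]

accepted = screen_query("Message Reddit user n34t-bot hello", checks)
too_short = screen_query("hello", checks)
```

Returning a reason string rather than a bare boolean lets the UI tell the user why a query was rejected.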

Task breakdown / planning

  • Break down the task into substeps to be completed in order

Knowledge lookup

  • Check the internal knowledge system to see whether we already know how to accomplish any of these substeps
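
As a rough illustration, the lookup can be thought of as splitting the planned substeps into those with saved actions and those that still have to be learned. `split_by_knowledge` is a hypothetical helper, the knowledge store is modeled as a plain dict, and matching on the exact substep text is a simplifying assumption.

```python
def split_by_knowledge(substeps, knowledge):
    """Partition substeps into (known, unknown), preserving plan order.

    `knowledge` maps a substep description to its saved list of actions.
    """
    known, unknown = {}, []
    for substep in substeps:
        if substep in knowledge:
            known[substep] = knowledge[substep]
        else:
            unknown.append(substep)
    return known, unknown


# Hypothetical store with one previously learned substep.
knowledge = {"Navigate to Reddit": [("goto", "https://www.reddit.com")]}
substeps = ["Navigate to Reddit", "Log in to Reddit", "Message user"]

known, unknown = split_by_knowledge(substeps, knowledge)
```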

Run tasks

  • Run each task, in order, to accomplish the workflow

Save results

  • Save the results to the knowledge DB
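
A small sketch of what saving could look like, using Python's built-in `sqlite3` with an in-memory database. The table layout, the function names, and the choice of SQLite are all assumptions made for illustration; the only behavior taken from this note is that actions are kept for reuse when the workflow succeeded.

```python
import json
import sqlite3


def open_knowledge_db(path=":memory:"):
    """Open the database and create the (hypothetical) workflows table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workflows ("
        "task TEXT PRIMARY KEY, actions TEXT NOT NULL)"
    )
    return conn


def save_if_successful(conn, task, actions, succeeded):
    """Store the actions for a task only when the run succeeded."""
    if not succeeded:
        return False
    conn.execute(
        "INSERT OR REPLACE INTO workflows (task, actions) VALUES (?, ?)",
        (task, json.dumps(actions)),
    )
    conn.commit()
    return True


def load_actions(conn, task):
    """Return the saved actions for a task, or None if nothing is stored."""
    row = conn.execute(
        "SELECT actions FROM workflows WHERE task = ?", (task,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_knowledge_db()
login_actions = [["type", "#username", "alice"], ["click", "button[type=submit]"]]
save_if_successful(conn, "Log in to Reddit", login_actions, succeeded=True)
save_if_successful(conn, "Message user", [], succeeded=False)
```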

Return result