Page Canary Design Doc

This document covers the engineering design of Page Canary.

We will use the GPT-3.5 LLM for cost reasons.

Bot algorithm

Loop

The core bot loop can be divided into the following stages:

State detection

The bot reads the state of the world:
- Native APIs
- Bot
  - OCR
  - Computer vision

Planning

The bot figures out what it should do.

Execution

The bot executes its plan.
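
As a rough illustration, the three stages above could be wired together as follows. This is a minimal sketch, not the actual Page Canary implementation: `WorldState`, `run_bot_loop`, the callback signatures, and the `max_iterations` limit are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorldState:
    """Snapshot of what the bot observed (hypothetical structure)."""
    observations: List[str] = field(default_factory=list)


def run_bot_loop(
    detect: Callable[[], WorldState],
    plan: Callable[[WorldState], List[str]],
    execute: Callable[[str], None],
    max_iterations: int = 10,
) -> int:
    """Run detect -> plan -> execute until the planner returns no actions.

    Returns the number of iterations in which at least one action ran.
    """
    completed = 0
    for _ in range(max_iterations):
        state = detect()        # State detection
        actions = plan(state)   # Planning
        if not actions:         # Nothing left to do, so stop early
            break
        for action in actions:  # Execution
            execute(action)
        completed += 1
    return completed
```

Passing the stages in as callables keeps the loop independent of how state is actually read (native APIs, OCR, or computer vision) and makes it easy to exercise with stubs.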

Page Canary Scaling

Free trial

Free trial instances run on the main host.

Overview

Each paid instance is its own VM
When a premium user signs up, we provision a VM
Users are given a dashboard to control the VM
Users can spin down the VM

Pricing

AWS pricing is $6 per VM.

If a user's billing fails, we spin down their VM and send them an email.
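
The billing-failure rule could be sketched roughly as below. The `Instance` record, the `stop_vm` and `send_email` callbacks, and the message text are assumptions made for illustration; they stand in for whatever the VM host API and mail provider actually expose.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Instance:
    """A paid user's VM (hypothetical record shape)."""
    user_email: str
    vm_id: str
    running: bool = True


def handle_billing_failure(
    instance: Instance,
    stop_vm: Callable[[str], None],
    send_email: Callable[[str, str], None],
) -> None:
    """Spin down the user's VM if it is running, then notify them by email."""
    if instance.running:
        stop_vm(instance.vm_id)
        instance.running = False
    send_email(
        instance.user_email,
        "Your Page Canary VM was spun down because billing failed.",
    )
```

Checking `running` first keeps the handler safe to call more than once for the same failed payment.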

Provisioning

Call the VM host API
Store the VM record in the database
Link the VM to its owning user

To provision a new VM, we need to call the API to create a new $6 VM for the user.
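
A minimal sketch of the provisioning steps, assuming a hypothetical `create_vm` callback that wraps the VM host API and a plain dictionary standing in for the database:

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VmRecord:
    """Database row linking a VM to its owner (hypothetical schema)."""
    vm_id: str
    owner_id: str


def provision_vm(
    owner_id: str,
    create_vm: Callable[[], str],
    database: Dict[str, VmRecord],
) -> VmRecord:
    """Create a VM, persist it, and associate it with the owning user."""
    vm_id = create_vm()                                 # Call the VM host API
    record = VmRecord(vm_id=vm_id, owner_id=owner_id)   # Link to the owner
    database[vm_id] = record                            # Store in the database
    return record
```

In a real system the store step would be a database write, and a failure there would likely need to trigger cleanup of the newly created VM.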

Installation

SSH to VM
Download installation script
Run installation script
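
One way to express these installation steps is to build a single `ssh` command that downloads and runs the script on the VM. This is only a sketch: the default `ubuntu` login user, the `install.sh` file name, and the script URL are placeholders, not values taken from the Page Canary setup.

```python
import shlex
from typing import List


def build_install_command(host: str, script_url: str, user: str = "ubuntu") -> List[str]:
    """Return an ssh argv that downloads the install script on the VM and runs it."""
    remote = (
        f"curl -fsSL {shlex.quote(script_url)} -o install.sh"
        " && bash install.sh"
    )
    return ["ssh", f"{user}@{host}", remote]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` so that a failed download or install raises an error instead of passing silently.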

VM selection

It was determined that the t4g.small instance is the instance type to use.

2 GB RAM

AI

How do we use artificial intelligence at Page Canary?

Large language models (LLMs)

Issue prioritization
Page content issue detection
Duplicate issue detection
Dynamic QA test plans

Computer vision

Object detection

OpenCV object detection is used.

Helpful shell scripts

Determine JPEG quality

identify -format '%Q' path-to-image.jpg

Page Canary LangChain

https://www.langchain.com/
https://js.langchain.com/docs/modules/agents/
https://minimaxir.com/2023/07/langchain-problem/
