internal
Engineering design

Page Canary Design Doc

This document covers the engineering design of Page Canary.

  • We will use the GPT-3.5 LLM for cost reasons

Bot algorithm

Loop

The core bot loop can be broken down into the following stages:

State detection

  • Bot reads the state of the world
    • Native APIs
    • Bot
      • OCR
      • Computer vision

Planning

  • Bot figures out what it should do

Execution

  • Bot executes its plan
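
The three stages above can be sketched as a simple loop. This is a minimal illustration, not the actual bot: `WorldState`, `Action`, `run_bot_loop`, and the three callables are hypothetical names introduced here, and the stopping rule (stop when the planner returns no actions) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorldState:
    """Hypothetical snapshot of what the bot observed."""
    page_text: str = ""  # e.g. from native APIs or OCR
    detected_objects: List[str] = field(default_factory=list)  # e.g. from computer vision


@dataclass
class Action:
    """Hypothetical unit of work produced by the planner."""
    name: str
    target: str = ""


def run_bot_loop(
    detect_state: Callable[[], WorldState],
    plan: Callable[[WorldState], List[Action]],
    execute: Callable[[Action], None],
    max_iterations: int = 10,
) -> int:
    """Run detect -> plan -> execute until the planner returns no actions.

    Returns the number of actions executed.
    """
    executed = 0
    for _ in range(max_iterations):
        state = detect_state()      # 1. state detection
        actions = plan(state)       # 2. planning
        if not actions:             # assumed stopping rule: nothing left to do
            break
        for action in actions:      # 3. execution
            execute(action)
            executed += 1
    return executed
```

Passing the stages in as callables keeps the loop independent of how state is read (native APIs, OCR, computer vision) and of which model does the planning.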

Page Canary Scaling

Free trial

Free trial instances run on the main host.

Overview

  • Each paid instance is its own VM
  • When a premium user signs up we provision a VM
  • Users are given a dashboard to control the VM
  • Users can spin down the VM

Pricing

The pricing from AWS is $6 / VM

  • If a user's billing fails, we spin down their VM and send them an email
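
The billing-failure rule can be sketched as a small handler. This is an illustration under assumptions: `Instance`, `handle_billing_failure`, and the `spin_down_vm` / `send_email` callables are hypothetical names, and the email wording is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Instance:
    """Hypothetical record tying a user to their VM."""
    user_email: str
    vm_id: str
    running: bool = True


def handle_billing_failure(
    instance: Instance,
    spin_down_vm: Callable[[str], None],
    send_email: Callable[[str, str], None],
) -> None:
    """Spin down the user's VM (if it is still running) and notify them."""
    if instance.running:
        spin_down_vm(instance.vm_id)
        instance.running = False
    # Placeholder message; the real wording is not specified in this document.
    send_email(
        instance.user_email,
        "Your Page Canary VM was stopped because billing failed.",
    )
```

Checking `running` first keeps the handler safe to call more than once for the same failed payment.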

Provisioning

  • Call the VM host API
  • Store the VM record in the database
  • Link the VM to the user who owns it

To provision a new VM, we call the VM host API to create a new $6 VM for the user.
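
A minimal sketch of the provisioning steps, assuming a SQLite table named `vms` and a hypothetical `create_vm` callable that wraps the VM host API and returns the new VM's ID. The table layout, function names, and the choice of `t4g.small` as the provisioned type (taken from the VM selection section) are assumptions for illustration only.

```python
import sqlite3
from typing import Callable

INSTANCE_TYPE = "t4g.small"  # assumed to be the type named under "VM selection"


def init_schema(db: sqlite3.Connection) -> None:
    """Create the hypothetical table that links VMs to their owners."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS vms ("
        "vm_id TEXT PRIMARY KEY, user_id TEXT NOT NULL, instance_type TEXT NOT NULL)"
    )


def provision_vm(
    db: sqlite3.Connection,
    user_id: str,
    create_vm: Callable[[str], str],
) -> str:
    """Create a VM, store it, and link it to its owner. Returns the VM ID."""
    vm_id = create_vm(INSTANCE_TYPE)  # 1. call the VM host API (hypothetical wrapper)
    db.execute(                       # 2 + 3. store the record, linked to the user
        "INSERT INTO vms (vm_id, user_id, instance_type) VALUES (?, ?, ?)",
        (vm_id, user_id, INSTANCE_TYPE),
    )
    db.commit()
    return vm_id
```

Storing the owner in the same row as the VM means the dashboard can look up a user's VM with a single query.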

Installation

  • SSH to VM
  • Download installation script
  • Run installation script
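
The installation steps can be sketched as a single remote command run over SSH. This is a sketch under assumptions: the helper names, the remote file name `install.sh`, and the use of `curl` and `bash` on the VM are not from this document, and the script URL is supplied by the caller.

```python
import shlex
import subprocess
from typing import Callable, List


def build_install_command(host: str, user: str, script_url: str) -> List[str]:
    """Build an ssh invocation that downloads and runs the installation script."""
    # The remote shell downloads the script, then runs it only if the download succeeded.
    remote = f"curl -fsSL {shlex.quote(script_url)} -o install.sh && bash install.sh"
    return ["ssh", f"{user}@{host}", remote]


def install_on_vm(
    host: str,
    user: str,
    script_url: str,
    run: Callable[..., object] = subprocess.run,
) -> List[str]:
    """SSH to the VM and run the installer; raises if the command fails."""
    cmd = build_install_command(host, user, script_url)
    run(cmd, check=True)  # check=True raises CalledProcessError on a non-zero exit
    return cmd
```

Building the command separately from running it makes the exact command easy to log or inspect before it touches a real VM.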

VM selection

We selected the t4g.small instance:

  • 2 GB RAM

AI

How do we use artificial intelligence at Page Canary?

Large language models (LLMs)

  • Issue prioritization
  • Page content issue detection
  • Duplicate issue detection
  • Dynamic QA test plans

Computer vision

Object detection

  • OpenCV is used for object detection

Recommended reading resources

Helpful shell scripts

Determine JPEG quality

identify -format '%Q' path-to-image.jpg
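
If the quality value is needed inside a script, the same ImageMagick command can be wrapped in a small helper. This is a sketch that assumes ImageMagick's `identify` is on the PATH; the function name `jpeg_quality` is hypothetical.

```python
import subprocess
from typing import Callable


def jpeg_quality(path: str, run: Callable[..., object] = subprocess.run) -> int:
    """Return the JPEG quality that ImageMagick's `identify` reports for `path`."""
    result = run(
        ["identify", "-format", "%Q", path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout as str
        check=True,           # raise if identify exits with an error
    )
    return int(result.stdout.strip())
```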

Page Canary LangChain

  • https://www.langchain.com/
  • https://js.langchain.com/docs/modules/agents/
  • https://minimaxir.com/2023/07/langchain-problem/