Setup LLM model for GCP
We take data security very seriously. Your code will sit on your premises and go to a model that you control, sitting in your cloud.
Part 1: Getting access to model
Go to URL: https://console.cloud.google.com/vertex-ai/publishers/anthropic/model-garden/claude-3-haiku
Select the project you want to install your LLM from top-left
Click
Enable
button to request access to this model
Fill out the form with relevant information. You’ll also have to select a billing account as part of this process.
Once the final step is completed on above form, you’ll see a success modal. Now you should have access to
Claude 3 Haiku
model with-in few minutes.
Part 2: Increasing quota & tokens limit of your model
Increasing quota limit
Please note that, depending on your organization contract with Google, you might need to increase the quota for Claude 3 Haiku
model. We expect the Quota = 250
. Please visit this URL to check / request quota increase: https://console.cloud.google.com/iam-admin/quotas
In case you don’t see any "Quotas" in the list, go back to your Model Garden page for Claude and click
Open Notebook
CTA.Click
Enable
on the modal to enableVertex AI API
. Once done, you should see “Enabled” for Vertex AI API. This might take a few seconds to a minute.
Once
Enabled
, you’ll be asked toConfirm
the action. HitConfirm
.
Now go back to the Quota page.
You should be now seeing a few “Services”
From the
Filter
CTA, search:base_model:anthropic-claude-3-haiku
Select the
Service
you want to edit the quota of for the region of your choice by clickingEdit quotas
CTA on top-right. Selecteurope-west4
, ClickEdit Quotas
and set the quota to250
.
Increasing tokens limit
Increasing tokens limit
You'll also need to increase Tokens/min for your model. This has to be requested via support ticket and needs to be set at 1 million tokens/min
for Claude 3 Haiku
for europe-west4
. Follow the steps highlighted below to raise support ticket.
Go to https://console.cloud.google.com/support/createcase/ to create a support ticket
In "Select your product" type and select
Vertex AI Other
In "Describe your issue",
Hitting HTTP 429 error on Vertex API for region: europe-west4 model: anthropic-claude-3-haiku even when usage is below quota limit
Select priority as:
P2 High Imapct
Hit
Next
CTA to describe your issue in the next step
Skip the
Resources
step by hittingNext
CTA
In detailed description step:
Select the product sub-category as
Other
Add your project ID. You'll find it in your project dropdown from top-left of nav
In "Observed error message" add this:
In "Provide more details" input, add this description for a faster workflow. Feel free to add any additional information if you like.
Once done, hit
Submit
to raise support ticket.
Once you get your Tokens/min updated to 1 million tokens/min, move to the next step.
Part 3: Accessing your p0 instance
Once you've modified the quota and token limit, you can now go to your VM instance page and find the Project ID to submit in our product's onboarding. Continue the rest of the onboarding as directed.