Self-host the LLM

We take data security very seriously. Your code stays on your premises and is only sent to a model that you control, running in your own cloud.

To self-host the LLM, you have two options:

  1. Self-host using Amazon AWS

  2. Self-host using Microsoft Azure

Both options are covered in detail below.

Option 1: Self-host using AWS
Pre-requisites
  1. You will need access to a g5.48xlarge instance. We run Mixtral 8x7B Instruct v0.1, a large open-source model.

  2. You will need to request a service quota increase by going to the link below (make sure to change the region to the one applicable to you). A scripted alternative is sketched below.

    1. https://<REGION_NAME>.console.aws.amazon.com/servicequotas/home/services/ec2/quotas/L-DB2E81BA

    2. Click on Request increase at account level

    3. Enter 200 vCPUs in the Increase quota value field

Alternatively, you can either leverage Bedrock (quicker to get set up) or negotiate with AWS to get access to this instance.
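If you prefer to request the quota increase programmatically rather than through the console, a minimal sketch using boto3 might look like the following. The region is a placeholder; the quota code L-DB2E81BA comes from the console URL above.

```python
# Sketch: request the G-instance vCPU quota increase with boto3 instead of
# the console. Quota code L-DB2E81BA is the "Running On-Demand G and VT
# instances" quota from the URL above; the region is a placeholder.
import boto3

client = boto3.client("service-quotas", region_name="us-east-1")

response = client.request_service_quota_increase(
    ServiceCode="ec2",
    QuotaCode="L-DB2E81BA",
    DesiredValue=200,  # 200 vCPUs covers one g5.48xlarge (192 vCPUs)
)
print(response["RequestedQuota"]["Status"])  # e.g. PENDING or CASE_OPENED
```

The request lands in the same approval queue as a console request, so expect the status to stay PENDING (or CASE_OPENED) until AWS approves it.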

Setting up your LLM

Once you have access to the vCPUs, you need to create a new EC2 instance.

  1. Go to this link (make sure to change the region to the one applicable to you): https://<REGION>.console.aws.amazon.com/ec2/home?region=<REGION>#LaunchInstances:

  2. Name the instance, for example p0-llm

  3. Click on Browse More AMIs

  4. Click on Community AMIs

  5. Search for P0-LLM

  6. Press Select

  7. Select g5.48xlarge as the instance type

  8. Generate a key pair (or re-use an old one). Make sure you have access to this key pair.

  9. Check Allow HTTP traffic from the internet

  10. Configure 350 GB of storage space.

  11. Make sure that the instance is accessible from the EC2 box that runs the p0 service. Launch the instance.

  12. Take note of the public IPv4 address. You will need to paste this into the p0 product.

  13. It may take up to 25 minutes for the AMI to boot completely; the LLM takes some time to stand up because it is quite large. You can check the reachability of the LLM by visiting http://<LLM_IP>/v1/chat/completions. Visiting it in a browser should return a "Method Not Allowed" error, which is expected: a browser sends the wrong method, headers, and payload.
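To go beyond a browser check, you can confirm that the model actually answers requests. Below is a minimal sketch, assuming the endpoint speaks the OpenAI-compatible chat-completions format implied by the URL above; the model identifier is an assumption and may differ in your deployment.

```python
# Reachability check for the LLM endpoint. The payload shape assumes an
# OpenAI-compatible /v1/chat/completions API; LLM_IP is the public IPv4
# address noted in the previous step.
import requests

LLM_IP = "203.0.113.10"  # replace with your instance's public IPv4 address

response = requests.post(
    f"http://{LLM_IP}/v1/chat/completions",
    json={
        # Model name is an assumption (the Hugging Face identifier);
        # your deployment may register it under a different name.
        "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 16,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

If this prints a short greeting, the LLM is up and the p0 service will be able to reach it (provided the network path from the p0 box is open).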

Option 2: Self-host using Azure
Pre-requisites
  1. You will need access to a Standard_NC96ads_A100_v4 instance. We run Mixtral 8x7B, a large open-source model.

  2. First, check whether you have a quota for the above instance in your region. Go to this link. Search for NCADS_A100_v4. In the region of your choice, change the limit to 96 vCPUs.

  3. Then request the quota increase (you might have to create a support ticket for this, as these GPUs are in short supply). If you cannot wait, we recommend leveraging Azure Pay-As-You-Go or Azure Hosted Endpoints, as these have no quotas associated with them.
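If you prefer to check your current quota programmatically, a rough sketch using the Azure SDK for Python (azure-identity and azure-mgmt-compute) follows. The subscription ID and region are placeholders, and the family-name match is approximate.

```python
# Sketch: list compute usage in a region and look for the NC A100 v4 vCPU
# family. Requires `pip install azure-identity azure-mgmt-compute`.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<SUBSCRIPTION_ID>"  # placeholder
client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

for usage in client.usage.list(location="eastus"):  # use your region
    # Approximate match: the portal lists this family as NCADS_A100_v4.
    if "ncads" in usage.name.value.lower():
        print(f"{usage.name.value}: {usage.current_value}/{usage.limit} vCPUs")
```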

Setting up your LLM

Once you have access to the vCPUs, you will need to set up the LLM.

  1. Go to this link

  2. Create a new Resource Group (or use an existing one)

  3. In Virtual Machine Name, type in P0-llm

  4. Set your desired region

  5. Set one availability zone

  6. In Size, select the GPU instance you requested above.

  7. Generate a new key pair or use an existing one. You will need to SSH into the GPU instance to be able to start it.

  8. Configure 350 GB of disk space.

  9. Ensure that the LLM is accessible from the p0 server that you set up before. If it is not accessible, the Scan will not work (a connectivity check is sketched after this list).

  10. Press Review + Create.

  11. Wait for the deployment to come up and press Go to resource.

  12. Take note of the public IP. You will need this to use the product.
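Before moving on, it is worth verifying from the p0 server that the VM's public IP is reachable, since the Scan fails otherwise. A minimal sketch, assuming the endpoint listens on HTTP port 80 as in the AWS setup above; the IP is a placeholder.

```python
# Connectivity sketch to run from the p0 server: verifies the LLM VM's
# public IP (noted above) accepts connections before starting a Scan.
import socket

LLM_IP = "203.0.113.10"  # replace with the VM's public IP
PORT = 80                # the LLM endpoint is served over HTTP in this guide

try:
    with socket.create_connection((LLM_IP, PORT), timeout=10):
        print(f"Reachable: {LLM_IP}:{PORT}")
except OSError as err:
    print(f"Not reachable ({err}); check NSG rules and the VM's network setup.")
```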

Contact us

If you are facing any trouble setting up your on-prem application, reach out to us at contact[at]p0[dot]inc

© 2024 p0. All rights reserved.