Triton Deployment at Coffea-Casa facility
This environment provides access to a Triton inference server deployment with 2 GPUs. Credentials for the Triton model repository are mounted in the session and are available through environment variables.
Available Deployment
Triton server accessible at
triton-grpc.cmsaf-prod.flatiron.hollandhpc.org2 GPUs available for inference
Model repository backed by S3-compatible storage
Credentials mounted in the session via
AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,TRITON_BUCKET_HOST, andTRITON_BUCKET_NAME
Uploading Models
Use the following example to upload a model directory to the mounted Triton model repository bucket:
import boto3
import os
from pathlib import Path
s3 = boto3.client(
"s3",
endpoint_url=f"http://{os.environ['TRITON_BUCKET_HOST']}",
aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
def upload_directory(local_path: str, bucket: str, s3_prefix: str = ""):
local = Path(local_path)
for file_path in local.rglob("*"):
if file_path.is_file():
s3_key = f"{s3_prefix}/{file_path.relative_to(local)}".lstrip("/")
print(f"Uploading {file_path} -> s3://{bucket}/{s3_key}")
s3.upload_file(str(file_path), bucket, s3_key)
upload_directory(
local_path="reconstruction_bdt_xgb",
bucket=os.environ["TRITON_BUCKET_NAME"],
s3_prefix="reconstruction_bdt_xgb",
)
Accessing Triton from Coffea-Casa
Connect to Triton from coffea-casa using the GRPC endpoint:
triton-grpc.cmsaf-prod.flatiron.hollandhpc.org
If a port is required, use the standard Triton gRPC port, for example:
triton-grpc.cmsaf-prod.flatiron.hollandhpc.org:8001