llm-api.arc.vt.edu

Description

https://llm-api.arc.vt.edu/api/v1/ provides an OpenAI-compatible API endpoint to a selection of LLMs hosted and run by ARC. It is based on the Open WebUI platform, and integrates several inference and integration capabilities, including retrieval-augmented generation (RAG), web search, vision, and image generation.

Access

  • All Virginia Tech students, faculty, and staff may access the service at https://llm-api.arc.vt.edu/api/v1/ using a personal API key. No separate ARC account is required to use the API.

  • There is no charge to individual users for accessing the hosted models via the API.

  • Users must generate an API key through https://llm.arc.vt.edu under User profile > Settings > Account > API keys. Keys are unique to each user and must be kept confidential. Do not share your keys.
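Rather than hardcoding the key in scripts, you can read it from an environment variable. A minimal sketch (the variable name ARC_LLM_API_KEY is only an example, not something the service defines):

```python
import os

# Read the API key from the environment instead of hardcoding it;
# fall back to a placeholder so the snippet runs standalone.
api_key = os.environ.get("ARC_LLM_API_KEY", "sk-YOUR-API-KEY")
headers = {"Authorization": f"Bearer {api_key}"}
```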

Restrictions

  • Data classification restriction. Researchers can use this tool for high-risk data. This service is approved by the VT Information Technology Security Office (ITSO) for processing sensitive or regulated data. However, researchers are reminded to consult with the VT Privacy and Research Data Protection Program (PRDP) and the Office of Export and Secure Research Compliance regarding the storage and analysis of high-risk data to comply with specific regulations. Note that some high-risk data (e.g. data regulated by DFARS, ITAR, etc.) require additional protections, and the LLM might not be approved for use with those data types.

  • Users are subject to API and web interface limits to ensure fair usage: 60 requests per minute, 1000 requests per hour, and 2000 requests in a 3-hour sliding window.
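When a limit is exceeded, an HTTP 429 response is the likely result (an assumption based on common OpenAI-compatible servers; the exact status code is not documented here). A client-side sketch of exponential backoff:

```python
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def post_with_retry(send, max_attempts: int = 5):
    """Call `send()` until it returns a response whose status is not 429."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        time.sleep(backoff_delay(attempt))
    return response
```

Here `send` would be a zero-argument wrapper around your actual `requests.post` call.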

Models

ARC currently runs three state-of-the-art models. ARC will add or remove models and scale instances dynamically in response to user demand. Select your preferred model in the request body, e.g. "model": "GLM-4.5-Air".

  • Z.ai GLM-4.5-Air (see model card on Hugging Face). High-performance public model.

  • QuantTrio GLM-4.5V-AWQ (see model card on Hugging Face). GLM variant with vision capabilities.

  • OpenAI gpt-oss-120b (see model card on Hugging Face). OpenAI’s flagship public model.
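Since the endpoint is OpenAI-compatible, the current model list can presumably be retrieved from the standard GET /models endpoint (an assumption; verify against the Open WebUI documentation). A sketch that extracts the model IDs from an OpenAI-style response:

```python
import json

def model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style GET /models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Illustrative response shape, not actual server output:
sample = json.loads('{"data": [{"id": "GLM-4.5-Air"}, {"id": "gpt-oss-120b"}]}')
print(model_ids(sample))
```

With a real request, the same helper would parse the JSON body of `requests.get(f"{base_url}/models", headers=...)`.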

Security

This service is hosted entirely on-premises within the ARC infrastructure. No data is sent to any third party outside of the university. All user interactions are logged and preserved in compliance with VT IT Security Office Data Protection policies.

Disclaimer

ARC has implemented safeguards to mitigate the risk of generating unlawful, harmful, or otherwise inappropriate content. Despite these measures, LLMs may still produce inaccurate, misleading, biased, or harmful information. Use of this service is undertaken entirely at the user’s own discretion and risk. The service is provided “as is”, and, to the fullest extent permitted by applicable law, ARC and VT expressly disclaim all warranties, whether express or implied, as well as any liability for damages, losses, or adverse consequences that may result from the use of, or reliance upon, the outputs generated by the models. By using this service, the user acknowledges and accepts these conditions, and agrees to comply with all applicable terms and conditions governing the use of the hosted models, associated software, and underlying platforms.

Examples

Please read the OpenAI API documentation for a comprehensive guide to the different ways to interact with the LLMs. You may also consult the Open WebUI documentation for API endpoints for additional examples involving Retrieval Augmented Generation (RAG), knowledge collections, image generation, tool calling, web search, etc.

Shell API

Use this API to interact with the LLMs directly from the command line.

Chat completions

Submit a query to a model.

curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "GLM-4.5-Air",
        "messages": [{
           "role":"user",
           "content":"Why is the sky blue?"
        }]
      }'

Document upload

Upload a document to the LLM. Every file is assigned a unique file id. You can use the file ids to do Retrieval Augmented Generation (RAG).

curl -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@/path/to/file.pdf" https://llm-api.arc.vt.edu/api/v1/files/

Retrieval Augmented Generation (RAG)

Upload a file, extract its file id, and submit a query about the document to the LLM.

## Upload document and get file ID
file_id=$(curl -s -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@document.pdf" \
  https://llm-api.arc.vt.edu/api/v1/files/ | jq -r '.id')

## Use the file ID in the request
request=$(jq -n \
  --arg model "GLM-4.5-Air" \
  --arg file_id "$file_id" \
  --arg prompt "Create a summary of the document" \
  '{
    model: $model,
    messages: [{role: "user", content: $prompt}],
    files: [{type: "file", id: $file_id}]
  }')

## Make the chat completion request with the file
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d "$request"

Reasoning effort

You may set the reasoning effort of gpt-oss-120b to low, medium (default), or high.

curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-oss-120b",
        "messages": [{
           "role":"user",
           "content":"Why is the sky blue?"
        }],
        "reasoning_effort": "high"
      }'

Image generation (Qwen-Image)

This approach generates an image using Qwen/Qwen-Image. The base model selection does not affect the image generation engine.

API_KEY="sk-YOUR-API-KEY"

RESPONSE=$(jq -n \
  --arg model "GLM-4.5V-AWQ" \
  --arg prompt "Generate a picture of a chocolate lab dog" \
  --arg tool_id "image_generation_and_edit" \
  '{
    model: $model,
    messages: [{role: "user", content: $prompt}],
    tool_ids: [$tool_id]
  }' | \
  curl -s -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
       -H "Authorization: Bearer $API_KEY" \
       -H "Content-Type: application/json" \
       -d @-)

## Extract the file ID from the response
FILE_ID=$(echo "$RESPONSE" | jq -r '.sources[0].document[0] | fromjson | .file_id')

## Download the image using the file ID
curl -s -L -o "image.png" \
  -H "Authorization: Bearer $API_KEY" \
  "https://llm-api.arc.vt.edu/api/v1/files/$FILE_ID/content"

Python API

The Python examples below require the openai and requests libraries. You may install them using pip:

pip install openai requests

Chat completions

Submit a query to a model.

from openai import OpenAI

# Point the OpenAI client at the ARC server.
openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Virginia Tech known for?"},
]

response = client.chat.completions.create(
    model="GLM-4.5-Air",
    messages=messages,
)
print(response.choices[0].message.content)
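Responses can also be streamed token by token by passing stream=True to the standard OpenAI client. A sketch of collecting the streamed deltas, using stand-in chunk objects so it runs without a network call:

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Join the content deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)

# Stand-in chunks mimicking the OpenAI streaming shape (illustrative only).
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hello", ", ", "world"]
]
print(collect_stream(fake_chunks))
```

With a real client, the same loop consumes the iterator returned by client.chat.completions.create(..., stream=True).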

Document upload

Upload a document to the LLM. Every file is assigned a unique file id. You can use the file ids to do Retrieval Augmented Generation (RAG).

import os
import requests

api_key="sk-YOUR-API-KEY"
base_url="https://llm-api.arc.vt.edu/api/v1/files/"
file_path="document.pdf"

if os.path.isfile(file_path):
    with open(file_path, "rb") as file:
        response = requests.post(
            base_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )        
        if response.status_code == 200:
            print(f"Uploaded {file_path} successfully!")
        else:
            print(f"Failed to upload {file_path}. Status code: {response.status_code}")
else:
    print(f"File not found: {file_path}")

Retrieval Augmented Generation (RAG)

Upload a file, extract its file id, and submit a query about the document to the LLM.

import os
import requests
import json

api_key="sk-YOUR-API-KEY"
file_path="document.pdf"

def upload_file(file_path):
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")

    with open(file_path, "rb") as file:
        response = requests.post(
            "https://llm-api.arc.vt.edu/api/v1/files/",
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )

    if response.status_code == 200:
        data = response.json()
        file_id = data.get("id")
        if file_id:
            print(f"Uploaded {file_path} successfully! File ID: {file_id}")
            return file_id
        else:
            raise RuntimeError("Upload succeeded but no file id returned.")
    else:
        raise RuntimeError(f"Failed to upload {file_path}. Status code: {response.status_code}")

file_id = upload_file(file_path)

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
    }
data = {
    "model": "GLM-4.5-Air",
    "messages": [{
        "role": "user",
        "content": "Create a summary of the document"}],
    "files": [{"type": "file", "id": file_id}],
}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.text)

Image generation (FLUX.1-dev)

This approach generates an image using black-forest-labs/FLUX.1-dev. The base model selection does not affect the image generation engine.

import base64
import requests
from urllib.parse import urlparse
from openai import OpenAI

openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

response = client.images.generate(
    model="gpt-oss-120b",
    prompt="A gray tabby cat hugging an otter with an orange scarf",
    size="512x512",
)

base_url = urlparse(openai_api_base)
image_url = f"{base_url.scheme}://{base_url.netloc}" + response.data[0].url
headers = {"Authorization": f"Bearer {openai_api_key}"}
img_data = requests.get(image_url, headers=headers).content

with open("output.png", 'wb') as handler:
    handler.write(img_data)

Image to text

Submit an image together with a text prompt to the vision-capable GLM-4.5V-AWQ model, which describes the image.

import requests
import base64
from pathlib import Path

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
openai_api_key = "sk-YOUR-API-KEY"
image_path = "bonnie.jpg"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}

image_b64 = convert_image_to_base64(image_path)

data = {
    "model": "GLM-4.5V-AWQ",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the image"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=data, timeout=30)
print(response.json()["choices"][0]["message"]["content"])
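The example above hardcodes a JPEG data URL; for other formats the MIME type should match the file. A small helper (assuming the server accepts standard image data URLs):

```python
import base64
import mimetypes
from pathlib import Path

def image_to_data_url(image_path: str) -> str:
    """Encode an image file as a data URL with the correct MIME type."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {image_path}")
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be used directly as the "url" value in the image_url content part.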

Image editing (Qwen-Image-Edit-2509)

This approach modifies an image using Qwen/Qwen-Image-Edit-2509. The base model selection does not affect the image generation engine.

import base64
import json
import requests
from pathlib import Path

BASE_URL = "https://llm-api.arc.vt.edu/api/v1"
API_KEY = "sk-YOUR-API-KEY"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

def request_image_edit(edit_instruction: str, image_path: str) -> dict:
    print("Submitting request for image edit...")

    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }

    image_b64 = convert_image_to_base64(image_path)

    payload = {
        "model": "GLM-4.5V-AWQ",
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": edit_instruction
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                },
            ]}],
        "tool_ids": ["image_generation_and_edit"],
    }

    # A long timeout is important here, image generation can take a while!
    resp = requests.post(url, headers=headers, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()

def extract_file_id(result: dict) -> str:
    sources = result.get("sources") or []
    for src in sources:
        docs = src.get("document") or []
        for doc in docs:
            if isinstance(doc, str):
                try:
                    parsed = json.loads(doc)
                except json.JSONDecodeError:
                    continue
                if isinstance(parsed, dict) and parsed.get("file_id"):
                    return parsed["file_id"]
    raise RuntimeError("No file_id found in the response.")

def download_file_by_id(file_id: str, out_path: str) -> None:
    if not API_KEY:
        raise RuntimeError("API_KEY is not set")
    url = f"{BASE_URL}/files/{file_id}/content"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    r = requests.get(url, headers=headers, timeout=60)
    r.raise_for_status()
    Path(out_path).write_bytes(r.content)


# Edit these lines for your run:
EDIT_INSTRUCTION = "Change the weather in the image from sunny to rainy"
INPUT_IMAGE = "input_image.png"
OUTPUT_IMAGE = "output_image.png"

result = request_image_edit(EDIT_INSTRUCTION, INPUT_IMAGE)
file_id = extract_file_id(result)
download_file_by_id(file_id, OUTPUT_IMAGE)