llm-api.arc.vt.edu
Description
https://llm-api.arc.vt.edu/api/v1/ provides an OpenAI-compatible API endpoint to a selection of LLMs hosted and run by ARC. It is based on the Open WebUI platform and integrates several inference capabilities, including retrieval-augmented generation (RAG), web search, vision, and image generation.
Access
All Virginia Tech students, faculty, and staff may access the service at https://llm-api.arc.vt.edu/api/v1/ using a personal API key. No separate ARC account is required to use the API. There is no charge to individual users for accessing the hosted models via the API.
Users must generate an API key through https://llm.arc.vt.edu under User profile > Settings > Account > API keys. Keys are unique to each user and must be kept confidential. Do not share your keys.
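To help keep keys confidential, avoid hard-coding them in scripts; one common pattern is to read the key from an environment variable. A minimal sketch (the variable name ARC_LLM_API_KEY is illustrative, not part of the service):
import os

# Illustrative variable name; set it in your shell first:
#   export ARC_LLM_API_KEY="sk-YOUR-API-KEY"
api_key = os.environ.get("ARC_LLM_API_KEY")
if not api_key:
    raise RuntimeError("ARC_LLM_API_KEY is not set")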

Restrictions
Data classification restriction. Researchers can use this tool for high-risk data. This service is approved by the VT Information Technology Security Office (ITSO) for processing sensitive or regulated data. However, researchers are reminded to consult with the VT Privacy and Research Data Protection Program (PRDP) and the Office of Export and Secure Research Compliance regarding the storage and analysis of high-risk data to comply with specific regulations. Note that some high-risk data (e.g., data regulated by DFARS, ITAR, etc.) require additional protections, and the LLM might not be approved for use with those data types.
Users are subject to API and web interface limits to ensure fair usage: 60 requests per minute, 1000 requests per hour, and 2000 requests in a 3-hour sliding window.
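Requests beyond these limits will be rejected until the window clears. A minimal client-side sketch that retries with exponential backoff, assuming the server signals rate limiting with HTTP 429 (the exact status code is an assumption, not documented here):
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    # Retry rate-limited requests with exponential backoff.
    # Assumes HTTP 429 signals a rate-limit rejection.
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload, timeout=300)
        if response.status_code != 429:
            return response
        time.sleep(2 ** attempt)
    response.raise_for_status()
    return response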
Models
ARC currently runs three state-of-the-art models. ARC will add or remove models and scale instances dynamically to respond to user demand. You may select your preferred model in the request settings, e.g. "model": "GLM-4.5-Air".
Z.ai: GLM-4.5-Air (see model card on Hugging Face). High-performance public model.
QuantTrio: GLM-4.5V-AWQ (see model card on Hugging Face). GLM variant with vision capabilities.
OpenAI: gpt-oss-120b (see model card on Hugging Face). OpenAI’s flagship public model.
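Because the endpoint follows the OpenAI convention, you can usually discover which models are currently available by listing them through the client. A sketch, assuming the standard /models route is exposed:
from openai import OpenAI

client = OpenAI(
    api_key="sk-YOUR-API-KEY",
    base_url="https://llm-api.arc.vt.edu/api/v1",
)

# Assumes the standard OpenAI-style GET /models route is exposed.
for model in client.models.list():
    print(model.id)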
Security
This service is hosted entirely on-premises within the ARC infrastructure. No data is sent to any third party outside of the university. All user interactions are logged and preserved in compliance with VT IT Security Office Data Protection policies.
Disclaimer
ARC has implemented safeguards to mitigate the risk of generating unlawful, harmful, or otherwise inappropriate content. Despite these measures, LLMs may still produce inaccurate, misleading, biased, or harmful information. Use of this service is undertaken entirely at the user’s own discretion and risk. The service is provided “as is”, and, to the fullest extent permitted by applicable law, ARC and VT expressly disclaim all warranties, whether express or implied, as well as any liability for damages, losses, or adverse consequences that may result from the use of, or reliance upon, the outputs generated by the models. By using this service, the user acknowledges and accepts these conditions, and agrees to comply with all applicable terms and conditions governing the use of the hosted models, associated software, and underlying platforms.
Examples
Please read the OpenAI API documentation for a comprehensive guide to the different ways to interact with the LLM. You may also consult the Open WebUI documentation for API endpoints for additional examples involving retrieval-augmented generation (RAG), knowledge collections, image generation, tool calling, web search, etc.
Shell API
Use this API to interact with the LLMs directly from the command line.
Chat completions
Submit a query to a model.
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "GLM-4.5-Air",
    "messages": [{
      "role": "user",
      "content": "Why is the sky blue?"
    }]
  }'
Document upload
Upload a document to the server. Every file is assigned a unique file ID; you can use these file IDs for retrieval-augmented generation (RAG).
curl -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@/path/to/file.pdf" \
  https://llm-api.arc.vt.edu/api/v1/files/
Retrieval Augmented Generation (RAG)
Upload a file, extract its file id, and submit a query about the document to the LLM.
## Upload document and get file ID
file_id=$(curl -s -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@document.pdf" \
  https://llm-api.arc.vt.edu/api/v1/files/ | jq -r '.id')

## Use the file ID in the request
request=$(jq -n \
  --arg model "GLM-4.5-Air" \
  --arg file_id "$file_id" \
  --arg prompt "Create a summary of the document" \
  '{
    model: $model,
    messages: [{role: "user", content: $prompt}],
    files: [{type: "file", id: $file_id}]
  }')

## Make the chat completion request with the file
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d "$request"
Web search
Enable the server:websearch tool to let the model query the web.
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "GLM-4.5-Air",
    "messages": [{
      "role": "user",
      "content": "Who is the US president right now?"
    }],
    "tool_ids": ["server:websearch"]
  }'
Reasoning effort
You may set the reasoning effort of gpt-oss-120b to low, medium (the default), or high.
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss-120b",
    "messages": [{
      "role": "user",
      "content": "Why is the sky blue?"
    }],
    "reasoning_effort": "high"
  }'
Image generation (Qwen-Image)
This approach generates an image using Qwen/Qwen-Image. The base model selection does not affect the image generation engine.
API_KEY="sk-YOUR-API-KEY"
RESPONSE=$(jq -n \
--arg model "GLM-4.5V-AWQ" \
--arg prompt "Generate a picture of a chocolate lab dog" \
--arg tool_id "image_generation_and_edit" \
'{
model: $model,
messages: [{role: "user", content: $prompt}],
tool_ids: [$tool_id]
}' | \
curl -s -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d @-)
## Extract the file ID from the response
FILE_ID=$(echo "$RESPONSE" | jq -r '.sources[0].document[0] | fromjson | .file_id')
## Download the image using the file ID
curl -s -L -o "image.png" \
-H "Authorization: Bearer $API_KEY" \
"https://llm-api.arc.vt.edu/api/v1/files/$FILE_ID/content"
Python API
The Python examples require the openai and requests libraries. You may install them using pip:
pip install openai requests
Chat completions
Submit a query to a model.
from openai import OpenAI

# Point the OpenAI client at the ARC server.
openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"
client = OpenAI(api_key=openai_api_key, base_url=openai_api_base)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Virginia Tech known for?"},
]
response = client.chat.completions.create(model="GLM-4.5-Air", messages=messages)
print(response.choices[0].message.content)
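The same client can also stream tokens as they are generated, assuming the server honors the standard stream=True option of the chat completions API:
# Streaming variant (assumes the server supports stream=True).
stream = client.chat.completions.create(
    model="GLM-4.5-Air",
    messages=messages,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)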
Document upload
Upload a document to the server. Every file is assigned a unique file ID; you can use these file IDs for retrieval-augmented generation (RAG).
import os
import requests

api_key = "sk-YOUR-API-KEY"
base_url = "https://llm-api.arc.vt.edu/api/v1/files/"
file_path = "document.pdf"

if os.path.isfile(file_path):
    with open(file_path, "rb") as file:
        response = requests.post(
            base_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )
    if response.status_code == 200:
        print(f"Uploaded {file_path} successfully!")
    else:
        print(f"Failed to upload {file_path}. Status code: {response.status_code}")
else:
    print("File not found")
Retrieval Augmented Generation (RAG)
Upload a file, extract its file id, and submit a query about the document to the LLM.
import os
import requests
import json

api_key = "sk-YOUR-API-KEY"
file_path = "document.pdf"

def upload_file(file_path):
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")
    with open(file_path, "rb") as file:
        response = requests.post(
            "https://llm-api.arc.vt.edu/api/v1/files/",
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )
    if response.status_code == 200:
        data = response.json()
        file_id = data.get("id")
        if file_id:
            print(f"Uploaded {file_path} successfully! File ID: {file_id}")
            return file_id
        else:
            raise RuntimeError("Upload succeeded but no file id returned.")
    else:
        raise RuntimeError(f"Failed to upload {file_path}. Status code: {response.status_code}")

file_id = upload_file(file_path)

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}
data = {
    "model": "GLM-4.5-Air",
    "messages": [{
        "role": "user",
        "content": "Create a summary of the document"}],
    "files": [{"type": "file", "id": file_id}],
}
response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.text)
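Web search
The server:websearch tool from the shell example works the same way in Python. A minimal sketch using requests:
import requests

api_key = "sk-YOUR-API-KEY"
url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
data = {
    "model": "GLM-4.5-Air",
    "messages": [{"role": "user", "content": "Who is the US president right now?"}],
    "tool_ids": ["server:websearch"],
}
response = requests.post(url, headers=headers, json=data, timeout=300)
print(response.json()["choices"][0]["message"]["content"])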
Image generation (FLUX.1-dev)
This approach generates an image using black-forest-labs/FLUX.1-dev. The base model selection does not affect the image generation engine.
import requests
from urllib.parse import urlparse
from openai import OpenAI

openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

response = client.images.generate(
    model="gpt-oss-120b",
    prompt="A gray tabby cat hugging an otter with an orange scarf",
    size="512x512",
)

# The server returns a relative URL; prepend the scheme and host.
base_url = urlparse(openai_api_base)
image_url = f"{base_url.scheme}://{base_url.netloc}" + response.data[0].url

headers = {"Authorization": f"Bearer {openai_api_key}"}
img_data = requests.get(image_url, headers=headers).content
with open("output.png", "wb") as handler:
    handler.write(img_data)
Image to text (GLM-4.5V-AWQ)
Submit an image together with a text prompt to a vision-capable model.
import requests
import base64
from pathlib import Path

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
openai_api_key = "sk-YOUR-API-KEY"
image_path = "bonnie.jpg"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}
image_b64 = convert_image_to_base64(image_path)
data = {
    "model": "GLM-4.5V-AWQ",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the image"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                }
            ]
        }
    ]
}
response = requests.post(url, headers=headers, json=data, timeout=30)
print(response.json()["choices"][0]["message"]["content"])
Image editing (Qwen-Image-Edit-2509)
This approach modifies an image using Qwen/Qwen-Image-Edit-2509. The base model selection does not affect the image generation engine.
import base64
import json
import requests
from pathlib import Path

BASE_URL = "https://llm-api.arc.vt.edu/api/v1"
API_KEY = "sk-YOUR-API-KEY"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

def request_image_edit(edit_instruction: str, image_path: str) -> dict:
    print("Submitting request for image edit...")
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    image_b64 = convert_image_to_base64(image_path)
    payload = {
        "model": "GLM-4.5V-AWQ",
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": edit_instruction
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                },
            ]}],
        "tool_ids": ["image_generation_and_edit"],
    }
    # A long timeout is important here, image generation can take a while!
    resp = requests.post(url, headers=headers, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()

def extract_file_id(result: dict) -> str:
    sources = result.get("sources") or []
    for src in sources:
        docs = src.get("document") or []
        for doc in docs:
            if isinstance(doc, str):
                try:
                    parsed = json.loads(doc)
                except json.JSONDecodeError:
                    continue
                if isinstance(parsed, dict) and parsed.get("file_id"):
                    return parsed["file_id"]
    raise RuntimeError("No file_id found in the response sources.")

def download_file_by_id(file_id: str, out_path: str) -> None:
    if not API_KEY:
        raise RuntimeError("API_KEY is not set")
    url = f"{BASE_URL}/files/{file_id}/content"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    r = requests.get(url, headers=headers, timeout=60)
    r.raise_for_status()
    Path(out_path).write_bytes(r.content)

# Edit these lines for your run:
EDIT_INSTRUCTION = "Change the weather in the image from sunny to rainy"
INPUT_IMAGE = "input_image.png"
OUTPUT_IMAGE = "output_image.png"

result = request_image_edit(EDIT_INSTRUCTION, INPUT_IMAGE)
file_id = extract_file_id(result)
download_file_by_id(file_id, OUTPUT_IMAGE)
