llm-api.arc.vt.edu

Description

https://llm-api.arc.vt.edu/api/v1/ provides an OpenAI-compatible API endpoint to a selection of LLMs hosted and run by ARC. It is based on the Open WebUI platform, and integrates several inference and integration capabilities, including retrieval-augmented generation (RAG), web search, vision, and image generation.

Access

  • All Virginia Tech students, faculty, and staff may access the service at https://llm-api.arc.vt.edu/api/v1/ using a personal API key. No separate ARC account is required to use the API.

  • There is no charge to individual users for accessing the hosted models via the API.

  • Users must generate an API key through https://llm.arc.vt.edu under User profile > Settings > Account > API keys. Keys are unique to each user and must be kept confidential. Do not share your keys.
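Rather than hardcoding the key in scripts, you can read it from an environment variable. A minimal sketch (the variable name ARC_LLM_API_KEY is only an example, not something the service defines):

```python
import os

# Read the API key from the environment instead of hardcoding it;
# fall back to a placeholder so the snippet runs standalone.
api_key = os.environ.get("ARC_LLM_API_KEY", "sk-YOUR-API-KEY")
headers = {"Authorization": f"Bearer {api_key}"}
```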

Restrictions

  • Data classification restriction. Researchers can use this tool for high-risk data. This service is approved by the VT Information Technology Security Office (ITSO) for processing sensitive or regulated data. However, researchers are reminded to consult with the VT Privacy and Research Data Protection Program (PRDP) and the Office of Export and Secure Research Compliance regarding the storage and analysis of high-risk data to comply with specific regulations. Note that some high-risk data (e.g. data regulated by DFARS, ITAR, etc.) require additional protections, and the LLM might not be approved for use with those data types.

  • Users are subject to API and web interface limits to ensure fair usage: 60 requests per minute, 1000 requests per hour, and 2000 requests in a 3-hour sliding window.
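When a limit is exceeded, an HTTP 429 response is the likely result (an assumption based on common OpenAI-compatible servers; the exact status code is not documented here). A client-side sketch of exponential backoff:

```python
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def post_with_retry(send, max_attempts: int = 5):
    """Call `send()` until it returns a response whose status is not 429."""
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        time.sleep(backoff_delay(attempt))
    return response
```

Here `send` would be a zero-argument wrapper around your actual `requests.post` call.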

Models

ARC currently runs three state-of-the-art models. ARC will add or remove models and scale instances dynamically in response to user demand. Select your preferred model in the request body, e.g. "model": "GLM-4.5-Air".

  • Z.ai GLM-4.5-Air (see model card on Hugging Face). High-performance public model.

  • QuantTrio GLM-4.5V-AWQ (see model card on Hugging Face). GLM variant with vision capabilities.

  • OpenAI gpt-oss-120b (see model card on Hugging Face). OpenAI’s flagship public model.
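Since the endpoint is OpenAI-compatible, the current model list can presumably be retrieved from the standard GET /models endpoint (an assumption; verify against the Open WebUI documentation). A sketch that extracts the model IDs from an OpenAI-style response:

```python
import json

def model_ids(models_response: dict) -> list[str]:
    """Extract model IDs from an OpenAI-style GET /models response."""
    return [m["id"] for m in models_response.get("data", [])]

# Illustrative response shape, not actual server output:
sample = json.loads('{"data": [{"id": "GLM-4.5-Air"}, {"id": "gpt-oss-120b"}]}')
print(model_ids(sample))
```

With a real request, the same helper would parse the JSON body of `requests.get(f"{base_url}/models", headers=...)`.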

Security

This service is hosted entirely on-premises within the ARC infrastructure. No data is sent to any third party outside of the university. All user interactions are logged and preserved in compliance with VT IT Security Office Data Protection policies.

Disclaimer

ARC has implemented safeguards to mitigate the risk of generating unlawful, harmful, or otherwise inappropriate content. Despite these measures, LLMs may still produce inaccurate, misleading, biased, or harmful information. Use of this service is undertaken entirely at the user’s own discretion and risk. The service is provided “as is”, and, to the fullest extent permitted by applicable law, ARC and VT expressly disclaim all warranties, whether express or implied, as well as any liability for damages, losses, or adverse consequences that may result from the use of, or reliance upon, the outputs generated by the models. By using this service, the user acknowledges and accepts these conditions, and agrees to comply with all applicable terms and conditions governing the use of the hosted models, associated software, and underlying platforms.

Examples

Please read the OpenAI API documentation for a comprehensive guide to the different ways to interact with the LLMs. You may also consult the Open WebUI documentation for API endpoints for additional examples involving Retrieval Augmented Generation (RAG), knowledge collections, image generation, tool calling, web search, etc.

Shell API

Use this API to interact with the LLMs directly from the command line.

Chat completions

Submit a query to a model.

curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "GLM-4.5-Air",
        "messages": [{
           "role":"user",
           "content":"Why is the sky blue?"
        }]
      }'

Document upload

Upload a document to the LLM. Every file is assigned a unique file id. You can use the file ids to do Retrieval Augmented Generation (RAG).

curl -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@/path/to/file.pdf" https://llm-api.arc.vt.edu/api/v1/files/

Retrieval Augmented Generation (RAG)

Upload a file, extract its file id, and submit a query about the document to the LLM.

## Upload document and get file ID
file_id=$(curl -s -X POST \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Accept: application/json" \
  -F "file=@document.pdf" \
  https://llm-api.arc.vt.edu/api/v1/files/ | jq -r '.id')

## Use the file ID in the request
request=$(jq -n \
  --arg model "GLM-4.5-Air" \
  --arg file_id "$file_id" \
  --arg prompt "Create a summary of the document" \
  '{
    model: $model,
    messages: [{role: "user", content: $prompt}],
    files: [{type: "file", id: $file_id}]
  }')

## Make the chat completion request with the file
curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d "$request"

Reasoning effort

You may set the reasoning effort of gpt-oss-120b to low, medium (default), or high.

curl -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
  -H "Authorization: Bearer sk-YOUR-API-KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-oss-120b",
        "messages": [{
           "role":"user",
           "content":"Why is the sky blue?"
        }],
        "reasoning_effort": "high"
      }'

Image generation (Qwen-Image)

This approach generates an image using Qwen/Qwen-Image. The base model selection does not affect the image generation engine.

API_KEY="sk-YOUR-API-KEY"

RESPONSE=$(jq -n \
  --arg model "GLM-4.5V-AWQ" \
  --arg prompt "Generate a picture of a chocolate lab dog" \
  --arg tool_id "image_generation_and_edit" \
  '{
    model: $model,
    messages: [{role: "user", content: $prompt}],
    tool_ids: [$tool_id]
  }' | \
  curl -s -X POST "https://llm-api.arc.vt.edu/api/v1/chat/completions" \
       -H "Authorization: Bearer $API_KEY" \
       -H "Content-Type: application/json" \
       -d @-)

## Extract the file ID from the response
FILE_ID=$(echo "$RESPONSE" | jq -r '.sources[0].document[0] | fromjson | .file_id')

## Download the image using the file ID
curl -s -L -o "image.png" \
  -H "Authorization: Bearer $API_KEY" \
  "https://llm-api.arc.vt.edu/api/v1/files/$FILE_ID/content"

Python API

The Python examples below require the openai and requests libraries. You may install them using pip:

pip install openai requests

Chat completions

Submit a query to a model.

from openai import OpenAI

# Point the OpenAI client at the ARC server.
openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Virginia Tech known for?"},
]

response = client.chat.completions.create(
    model="GLM-4.5-Air",
    messages=messages,
)
print(response.choices[0].message.content)
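Responses can also be streamed token by token by passing stream=True to the standard OpenAI client. A sketch of collecting the streamed deltas, using stand-in chunk objects so it runs without a network call:

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Join the content deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)

# Stand-in chunks mimicking the OpenAI streaming shape (illustrative only).
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hello", ", ", "world"]
]
print(collect_stream(fake_chunks))
```

With a real client, the same loop consumes the iterator returned by client.chat.completions.create(..., stream=True).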

Document upload

Upload a document to the LLM. Every file is assigned a unique file id. You can use the file ids to do Retrieval Augmented Generation (RAG).

import os
import requests

api_key="sk-YOUR-API-KEY"
base_url="https://llm-api.arc.vt.edu/api/v1/files/"
file_path="document.pdf"

if os.path.isfile(file_path):
    with open(file_path, "rb") as file:
        response = requests.post(
            base_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )        
        if response.status_code == 200:
            print(f"Uploaded {file_path} successfully!")
        else:
            print(f"Failed to upload {file_path}. Status code: {response.status_code}")
else:
    print(f"File not found: {file_path}")

Retrieval Augmented Generation (RAG)

Upload a file, extract its file id, and submit a query about the document to the LLM.

import os
import requests
import json

api_key="sk-YOUR-API-KEY"
file_path="document.pdf"

def upload_file(file_path):
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")

    with open(file_path, "rb") as file:
        response = requests.post(
            "https://llm-api.arc.vt.edu/api/v1/files/",
            headers={
                "Authorization": f"Bearer {api_key}",
                "Accept": "application/json",
            },
            files={"file": file},
        )

    if response.status_code == 200:
        data = response.json()
        file_id = data.get("id")
        if file_id:
            print(f"Uploaded {file_path} successfully! File ID: {file_id}")
            return file_id
        else:
            raise RuntimeError("Upload succeeded but no file id returned.")
    else:
        raise RuntimeError(f"Failed to upload {file_path}. Status code: {response.status_code}")

file_id = upload_file(file_path)

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
    }
data = {
    "model": "GLM-4.5-Air",
    "messages": [{
        "role": "user",
        "content": "Create a summary of the document"}],
    "files": [{"type": "file", "id": file_id}],
}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.text)

Image generation (FLUX.1-dev)

This approach generates an image using black-forest-labs/FLUX.1-dev. The base model selection does not affect the image generation engine.

import base64
import requests
from urllib.parse import urlparse
from openai import OpenAI

openai_api_key = "sk-YOUR-API-KEY"
openai_api_base = "https://llm-api.arc.vt.edu/api/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

response = client.images.generate(
    model="gpt-oss-120b",
    prompt="A gray tabby cat hugging an otter with an orange scarf",
    size="512x512",
)

base_url = urlparse(openai_api_base)
image_url = f"{base_url.scheme}://{base_url.netloc}" + response.data[0].url
headers = {"Authorization": f"Bearer {openai_api_key}"}
img_data = requests.get(image_url, headers=headers).content

with open("output.png", 'wb') as handler:
    handler.write(img_data)

Image to text

Submit an image together with a text prompt to the vision-capable GLM-4.5V-AWQ model, which describes the image.

import requests
import base64
from pathlib import Path

url = "https://llm-api.arc.vt.edu/api/v1/chat/completions"
openai_api_key = "sk-YOUR-API-KEY"
image_path = "bonnie.jpg"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

headers = {
    "Authorization": f"Bearer {openai_api_key}",
    "Content-Type": "application/json"
}

image_b64 = convert_image_to_base64(image_path)

data = {
    "model": "GLM-4.5V-AWQ",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the image"
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=data, timeout=30)
print(response.json()["choices"][0]["message"]["content"])
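The example above hardcodes a JPEG data URL; for other formats the MIME type should match the file. A small helper (assuming the server accepts standard image data URLs):

```python
import base64
import mimetypes
from pathlib import Path

def image_to_data_url(image_path: str) -> str:
    """Encode an image file as a data URL with the correct MIME type."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {image_path}")
    encoded = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

The returned string can be used directly as the "url" value in the image_url content part.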

Image editing (Qwen-Image-Edit-2509)

This approach modifies an image using Qwen/Qwen-Image-Edit-2509. The base model selection does not affect the image generation engine.

import base64
import json
import requests
from pathlib import Path

BASE_URL = "https://llm-api.arc.vt.edu/api/v1"
API_KEY = "sk-YOUR-API-KEY"

def convert_image_to_base64(image_path: str) -> str:
    image_path = Path(image_path)
    if not image_path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    
    with open(image_path, "rb") as img_file:
        encoded = base64.b64encode(img_file.read()).decode("utf-8")
    return encoded

def request_image_edit(edit_instruction: str, image_path: str) -> dict:
    print("Submitting request for image edit...")

    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }

    image_b64 = convert_image_to_base64(image_path)

    payload = {
        "model": "GLM-4.5V-AWQ",
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": edit_instruction
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}
                },
            ]}],
        "tool_ids": ["image_generation_and_edit"],
    }

    # A long timeout is important here, image generation can take a while!
    resp = requests.post(url, headers=headers, json=payload, timeout=300)
    resp.raise_for_status()
    return resp.json()

def extract_file_id(result: dict) -> str:
    sources = result.get("sources") or []
    for src in sources:
        docs = src.get("document") or []
        for doc in docs:
            if isinstance(doc, str):
                try:
                    parsed = json.loads(doc)
                except json.JSONDecodeError:
                    continue
                if isinstance(parsed, dict) and parsed.get("file_id"):
                    return parsed["file_id"]
    raise RuntimeError("No file_id found in the response.")

def download_file_by_id(file_id: str, out_path: str) -> None:
    if not API_KEY:
        raise RuntimeError("API_KEY is not set")
    url = f"{BASE_URL}/files/{file_id}/content"
    headers = {"Authorization": f"Bearer {API_KEY}"}
    r = requests.get(url, headers=headers, timeout=60)
    r.raise_for_status()
    Path(out_path).write_bytes(r.content)


# Edit these lines for your run:
EDIT_INSTRUCTION = "Change the weather in the image from sunny to rainy"
INPUT_IMAGE = "input_image.png"
OUTPUT_IMAGE = "output_image.png"

result = request_image_edit(EDIT_INSTRUCTION, INPUT_IMAGE)
file_id = extract_file_id(result)
download_file_by_id(file_id, OUTPUT_IMAGE)