# Available Models
NEAR AI Cloud provides access to leading AI models, each optimized for different use cases, from advanced reasoning and tool calling to long-context processing and multilingual tasks. Models run in secure TEE environments where supported (anonymized-only endpoints are noted per model), with transparent, pay-per-use pricing.
## Quick Reference
| Model | Model ID | Context | Input Price | Output Price |
|---|---|---|---|---|
| Claude Opus 4.6 | `anthropic/claude-opus-4-6` | 200K | $5.00/M | $25.00/M |
| Claude Sonnet 4.5 | `anthropic/claude-sonnet-4-5` | 200K | $3.00/M | $15.50/M |
| FLUX.2-klein-4B | `black-forest-labs/FLUX.2-klein-4B` | 128K | $1.00/M | $1.00/M |
| DeepSeek V3.1 | `deepseek-ai/DeepSeek-V3.1` | 128K | $1.05/M | $3.10/M |
| Gemini 3 Pro Preview | `google/gemini-3-pro` | 1M | $1.25/M | $15.00/M |
| OpenAI GPT-5.2 | `openai/gpt-5.2` | 400K | $1.80/M | $15.50/M |
| GPT OSS 120B | `openai/gpt-oss-120b` | 131K | $0.15/M | $0.55/M |
| Qwen3 30B A3B Instruct 2507 | `Qwen/Qwen3-30B-A3B-Instruct-2507` | 262K | $0.15/M | $0.55/M |
| GLM 4.7 | `zai-org/GLM-4.7` | 131K | $0.85/M | $3.30/M |
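The model IDs in the table are used to select a model per request. As a minimal sketch, assuming NEAR AI Cloud exposes an OpenAI-compatible chat-completions endpoint (the base URL and path here are assumptions, not confirmed by this page; check the API reference before use):

```python
import json
import urllib.request

# Hypothetical base URL -- verify the real endpoint in the NEAR AI Cloud docs.
BASE_URL = "https://cloud-api.near.ai/v1"

def build_chat_request(model_id: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for a model from the table."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload to the (assumed) chat-completions endpoint."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Build a request for one of the listed text models (no network call made here).
payload = build_chat_request("deepseek-ai/DeepSeek-V3.1", "Hello!")
```

The same payload shape applies to any text model in the table; only the `model` field changes. Note that FLUX.2-klein-4B is an image model and uses a different request shape (see its section below).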
## Model Details
### Claude Opus 4.6

Anthropic's most intelligent model for building agents and coding.

Model ID: `anthropic/claude-opus-4-6`
### Claude Sonnet 4.5

A powerful, efficient model from Anthropic that balances intelligence and speed. Excels at complex reasoning, coding, and creative tasks with a 200K context window. Anonymized, not TEE-protected.

Model ID: `anthropic/claude-sonnet-4-5`
### FLUX.2-klein-4B

The FLUX.2 [klein] model family is Black Forest Labs' fastest image model line to date. FLUX.2 [klein] unifies generation and editing in a single compact architecture, delivering state-of-the-art quality with end-to-end inference in under a second. Built for applications that require real-time image generation without sacrificing quality.

Model ID: `black-forest-labs/FLUX.2-klein-4B`
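Because FLUX.2 [klein] generates images rather than chat completions, its request shape differs from the text models. A minimal sketch assuming an OpenAI-style `images/generations` payload (the endpoint and field names are assumptions; verify against the API reference):

```python
def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Build an OpenAI-style image-generation payload for FLUX.2 [klein].
    The exact fields supported by NEAR AI Cloud are an assumption here."""
    return {
        "model": "black-forest-labs/FLUX.2-klein-4B",
        "prompt": prompt,
        "size": size,   # hypothetical parameter; confirm supported sizes
        "n": 1,         # number of images to generate
    }

img_payload = build_image_request("a watercolor fox in a snowy forest")
```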
### DeepSeek V3.1

DeepSeek V3.1 is a hybrid model that supports both thinking mode and non-thinking mode. Compared to the previous version, this upgrade brings improvements across multiple areas.

Model ID: `deepseek-ai/DeepSeek-V3.1`
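The hybrid design means a single deployment can serve both fast answers and deliberate reasoning. A sketch of how the mode toggle might be expressed per request, assuming a vLLM-style `chat_template_kwargs` switch (this parameter name is an assumption, not confirmed by this page):

```python
def deepseek_request(prompt: str, thinking: bool) -> dict:
    """Chat payload for DeepSeek V3.1 with a per-request mode toggle.
    The `chat_template_kwargs` mechanism mirrors common vLLM-style
    deployments and is an assumption for this endpoint."""
    return {
        "model": "deepseek-ai/DeepSeek-V3.1",
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical switch between thinking and non-thinking mode.
        "chat_template_kwargs": {"thinking": thinking},
    }

fast = deepseek_request("Summarize this in one line.", thinking=False)
deep = deepseek_request("Prove the claim step by step.", thinking=True)
```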
### Gemini 3 Pro Preview

Google's Gemini 3 Pro Preview: a highly capable multimodal model with an industry-leading 1M-token context window. Optimized for complex reasoning, code generation, and long-document analysis. Anonymized, not TEE-protected.

Model ID: `google/gemini-3-pro`
### OpenAI GPT-5.2

OpenAI GPT-5.2 with a 400K context window. Anonymized endpoint optimized for deep reasoning and large-context workflows.

Model ID: `openai/gpt-5.2`
### GPT OSS 120B

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation.

Model ID: `openai/gpt-oss-120b`
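The function-calling and configurable-reasoning features above map onto the standard OpenAI tool schema. A sketch of a request that exercises both; the `get_weather` tool is hypothetical, and whether the hosted endpoint accepts a `reasoning_effort` field is an assumption:

```python
def gpt_oss_request(prompt: str, reasoning_effort: str = "medium") -> dict:
    """Chat payload exercising gpt-oss-120b's native function calling and
    configurable reasoning depth. Endpoint support for `reasoning_effort`
    is an assumption; the tool below is purely illustrative."""
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool],
        "reasoning_effort": reasoning_effort,
    }

req = gpt_oss_request("What's the weather in Lisbon?", reasoning_effort="low")
```

If the model decides to call the tool, the response contains a `tool_calls` entry whose arguments your application executes before sending the result back in a follow-up message.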
### Qwen3 30B A3B Instruct 2507

Qwen3-30B-A3B-Instruct-2507 is a mixture-of-experts (MoE) causal language model featuring 30.5 billion total parameters and 3.3 billion activated parameters per inference. It supports ultra-long context up to 262K tokens and operates exclusively in non-thinking mode, delivering strong enhancements in instruction following, reasoning, logical comprehension, mathematics, coding, multilingual understanding, and alignment with user preferences.

Model ID: `Qwen/Qwen3-30B-A3B-Instruct-2507`
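When feeding long documents into the 262K-token window, a cheap pre-flight estimate avoids rejected requests. A sketch using the common rough heuristic of ~4 characters per token (an approximation; use the model's own tokenizer for exact counts):

```python
QWEN3_CONTEXT = 262_144  # 262K-token context window

def fits_in_context(document: str, reserve_for_output: int = 4096) -> bool:
    """Rough pre-flight check before sending a long document.
    Uses the ~4 characters/token heuristic, which over- or under-counts
    depending on language and content; it is an estimate only."""
    estimated_tokens = len(document) // 4
    return estimated_tokens + reserve_for_output <= QWEN3_CONTEXT

short_ok = fits_in_context("hello world")        # small prompt fits
too_big = fits_in_context("x" * 2_000_000)       # ~500K tokens, does not fit
```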
### GLM 4.7

GLM 4.7 is the latest model from Z.ai: an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct-response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert-model iteration and reinforcement learning, the model achieves strong performance across agentic, reasoning, and coding (ARC) tasks, scoring 70.1% on TAU-Bench, 91.0% on AIME 24, and 64.2% on SWE-bench Verified.

Model ID: `zai-org/GLM-4.7`