bytedance · Vision

ByteDance: UI-TARS 7B

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Built by ByteDance, it builds upon the UI-TARS framework with reinforcement...

Context: 128k tokens
Input: $0.10 per 1M tokens
Output: $0.20 per 1M tokens
Pricing

Input tokens: $0.10 per 1M tokens
Output tokens: $0.20 per 1M tokens
Cache read: $0.10 per 1M tokens
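
To make the rates above concrete, here is a minimal sketch of a cost estimate for a single request. The function name `estimateCostUsd` and the `Usage` shape are hypothetical, and the sketch assumes that cache-read tokens are counted separately from regular input tokens; how the gateway actually reports and bills cached tokens is not stated on this page.

```typescript
// Rates copied from the pricing table (USD per 1M tokens).
const RATES_PER_MILLION = { input: 0.10, output: 0.20, cacheRead: 0.10 };

// Hypothetical usage shape: the three counts are assumed to be disjoint.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

// Returns the estimated cost in USD for one request.
function estimateCostUsd(usage: Usage): number {
  const perToken = (ratePerMillion: number) => ratePerMillion / 1000000;
  return (
    usage.inputTokens * perToken(RATES_PER_MILLION.input) +
    usage.outputTokens * perToken(RATES_PER_MILLION.output) +
    usage.cacheReadTokens * perToken(RATES_PER_MILLION.cacheRead)
  );
}

// Example: a 20k-token prompt (e.g. screenshots plus instructions) and a 500-token reply.
const exampleCost = estimateCostUsd({
  inputTokens: 20000,
  outputTokens: 500,
  cacheReadTokens: 0,
});
console.log(`Estimated cost: $${exampleCost.toFixed(4)}`);
```

Because the cache-read rate listed here equals the input rate, caching would not change the estimate under these assumptions.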

Technical details

Model ID: bytedance/ui-tars-1.5-7b
Context window: 128k tokens
Input modalities: image, text
Output modalities: text
Tokenizer: Other
Max output tokens: 2,048

Use with MegaBrain

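import OpenAI from 'openai'

// Point the OpenAI SDK at the MegaBrain gateway instead of the default endpoint.
const client = new OpenAI({
  baseURL: 'https://getmegabrain.com/api/gateway/v1',
  apiKey: process.env.MEGABRAIN_API_KEY,
})

// Top-level await requires an ES module (e.g. a .mjs file).
const response = await client.chat.completions.create({
  model: 'bytedance/ui-tars-1.5-7b',
  messages: [{ role: 'user', content: 'Hello!' }],
})

// Print the model's text reply.
console.log(response.choices[0].message.content)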
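
Since the model accepts image input and is aimed at GUI tasks, a request will usually carry a screenshot alongside the instruction. The sketch below builds such a request body using the OpenAI-style `image_url` content part with a base64 data URL. It assumes the gateway accepts that message format for this model, which this page does not confirm. The helper `buildScreenshotRequest` and its types are hypothetical; the 2,048 cap comes from the "Max output tokens" value listed under Technical details.

```typescript
const MODEL_ID = 'bytedance/ui-tars-1.5-7b';
const MAX_OUTPUT_TOKENS = 2048; // "Max output tokens" from the technical details

// OpenAI-style content parts for a multimodal user message.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface ChatRequest {
  model: string;
  max_tokens: number;
  messages: { role: 'user'; content: ContentPart[] }[];
}

// Hypothetical helper: pairs an instruction with a base64-encoded PNG screenshot
// and clamps the requested output length to the model's listed limit.
function buildScreenshotRequest(
  instruction: string,
  pngBase64: string,
  maxTokens: number = MAX_OUTPUT_TOKENS,
): ChatRequest {
  const capped = Math.max(1, Math.min(Math.floor(maxTokens), MAX_OUTPUT_TOKENS));
  return {
    model: MODEL_ID,
    max_tokens: capped,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: instruction },
          { type: 'image_url', image_url: { url: `data:image/png;base64,${pngBase64}` } },
        ],
      },
    ],
  };
}
```

The returned object could then be passed to `client.chat.completions.create(...)` in place of the text-only request above. The expected format of the model's reply (for example, how it describes clicks or coordinates) is not specified here and should be checked against the model's own documentation.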

Ready to use ByteDance: UI-TARS 7B?

Get an API key and start making requests in minutes.
