
#capacity

1 approved public term with this tag.

Throughput

/ˈθruːpaʊt/ noun
Technology

Machine-assisted translation draft (Japanese) for "Throughput": The amount of work a system can process in a given time period. In APIs it is usually measured in requests per second; in AI inference, in tokens per second. Throughput and latency are related but distinct: throughput counts the work completed across the whole system per unit of time, while latency is how long a single request takes, so a system can have high throughput while still having high latency for individual requests.

Draft example sentence: The inference cluster achieved a throughput of 10,000 tokens per second across all concurrent users.
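
The distinction between throughput and latency can be illustrated with a small calculation. The following is a minimal sketch, not a benchmark: the function name `throughput` and all figures (100 requests, 2.0 seconds per request, 50 served at a time) are hypothetical values chosen for illustration, and the model assumes requests are served in equal-sized waves with no queueing overhead.

```python
def throughput(units_completed: float, elapsed_seconds: float) -> float:
    """Return units of work completed per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return units_completed / elapsed_seconds


# Hypothetical workload (assumed numbers, not measurements).
per_request_latency_s = 2.0   # each request takes 2 seconds to finish
total_requests = 100
concurrency = 50              # requests served at the same time

# Requests are served in waves of `concurrency`; ceiling division
# counts the number of waves needed.
waves = -(-total_requests // concurrency)
elapsed_s = waves * per_request_latency_s

requests_per_second = throughput(total_requests, elapsed_s)

print(f"latency per request: {per_request_latency_s} s")
print(f"throughput: {requests_per_second} requests/s")
```

Under these assumptions each caller still waits 2 seconds for a response, yet the system as a whole completes 25 requests per second, which is the sense in which throughput can be high while per-request latency stays high.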