#performance
4 approved public terms with this tag.
Latency
Rascunho de traducao automatica (Portuguese) for "Latency": The time delay between initiating an action and receiving the first response. In networking, latency is the round-trip time for a data packet; in AI, it often refers to time-to-first-token or end-to-end inference time. Lower latency means faster, more responsive user experiences.
“Exemplo em rascunho: The new model has lower latency but slightly less accuracy — a classic speed/quality trade-off.”
Throughput
Rascunho de traducao automatica (Portuguese) for "Throughput": The amount of work a system can process in a given time period. In APIs it's usually measured in requests per second; in AI inference it's tokens per second. Throughput and latency are related but distinct — a system can have high throughput while still having high latency for individual requests.
“Exemplo em rascunho: The inference cluster achieved 10,000 tokens per second throughput across all concurrent users.”
Ate
Rascunho de traducao automatica (Portuguese) for "Ate": Did something perfectly, completely, and impressively. To "eat" (past tense: ate) a performance, look, or challenge means to dominate it fully with no leftovers — you consumed it entirely. Originates from ballroom culture and drag slang, now used broadly for anyone who executes something flawlessly.
“Exemplo em rascunho: She ate that chorus — the whole arena was on their feet.”
Serve
Rascunho de traducao automatica (Portuguese) for "Serve": To deliver an impressive, stunning, or top-tier look, performance, or presence. To "serve" means to offer something exceptional for others to receive and appreciate — like a waiter presenting a perfect dish. Rooted in ballroom and drag culture, it now applies to any context where someone presents themselves at their absolute best.
“Exemplo em rascunho: She walked into the room serving full corporate-casual realness.”