#performance
4 approved public terms with this tag.
Latency
The time delay between initiating an action and receiving the first response. In networking, latency is the round-trip time for a data packet; in AI, it often refers to time-to-first-token or end-to-end inference time. Lower latency means faster, more responsive user experiences.
“The new model has lower latency but slightly less accuracy — a classic speed/quality trade-off.”
Throughput
The amount of work a system can process in a given time period. In APIs it's usually measured in requests per second; in AI inference it's tokens per second. Throughput and latency are related but distinct — a system can have high throughput while still having high latency for individual requests.
“The inference cluster achieved 10,000 tokens per second throughput across all concurrent users.”
Ate
Did something perfectly, completely, and impressively. To "eat" (past tense: ate) a performance, look, or challenge means to dominate it fully with no leftovers — you consumed it entirely. Originates from ballroom culture and drag slang, now used broadly for anyone who executes something flawlessly.
“She ate that chorus — the whole arena was on their feet.”
Serve
To deliver an impressive, stunning, or top-tier look, performance, or presence. To "serve" means to offer something exceptional for others to receive and appreciate — like a waiter presenting a perfect dish. Rooted in ballroom and drag culture, it now applies to any context where someone presents themselves at their absolute best.
“She walked into the room serving full corporate-casual realness.”