The tokenizers have a strong bias towards English: "The dog eats the apples" is 5 tokens, "El perro come las manzanas" is 8 tokens, and many Japanese characters end up using two integer tokens for each character of text.
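One likely contributor to the Japanese case, assuming the tokenizer is a byte-level BPE that starts from UTF-8 bytes (as GPT-2 style tokenizers do): each Japanese character is already several bytes before any merging happens, so it can easily end up split across more than one token. A rough sketch that only counts UTF-8 bytes, not real tokens (the helper name `utf8_bytes_per_char` is made up for illustration):

```python
def utf8_bytes_per_char(text: str) -> list[tuple[str, int]]:
    """Pair each character with the number of bytes it takes in UTF-8.

    This is NOT a tokenizer; it only shows the raw byte cost that a
    byte-level BPE tokenizer would have to merge back together.
    """
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


if __name__ == "__main__":
    # ASCII letters take one byte each.
    print(utf8_bytes_per_char("dog"))
    # Common kanji and hiragana take three bytes each.
    print(utf8_bytes_per_char("犬が"))
```

Actual token counts depend on which byte sequences the tokenizer's vocabulary has learned to merge, so the byte count is only an upper bound on how fragmented a character can get.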
@fuzzychef 28!
@simon @fuzzychef now the challenge is to find a German word with the most tokens that has a Wikipedia page. Can you try https://en.wikipedia.org/wiki/Donaudampfschiffahrtselektrizit%C3%A4tenhauptbetriebswerkbauunterbeamtengesellschaft?