Best LLM for 🇫🇮Finnish (low-resource languages) Translation?
In the previous experiment, we concluded that google/madlad400-3b-mt is the best choice for running machine translation locally. The newly released Qwen3 "supports 100+ languages and dialects." The Qwen series has consistently demonstrated strong performance in translating between resource-rich languages like English and Chinese. This time, we're testing its support for the low-resource language Finnish.
Using the same test data as before, here are the inference code and prompt for Qwen3:
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# Local helpers from the previous experiment
from autodevice import get_device
from eval_fi_cn import benchmark
device = get_device()
model_name = "Qwen/Qwen3-4B"
base_prompt = "Translate the following text from Finnish into English."
print(f"Using model '{model_name}' with prompt '{base_prompt}'")
# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    # device_map=device
)
model = model.to(device)
def generateBatch(batch):
    prompts = []
    for i, text in enumerate(batch, start=1):
        print(f'\033[94m{i}. {text}\033[0m')
        # We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
        messages = [
            {"role": "user", "content": f"{base_prompt}\n{text}."},
        ]
        prompt =…
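For readers unfamiliar with chat templates: Qwen's tokenizer renders the `messages` list into a ChatML-style string before tokenization. A minimal sketch of that rendering, written as a plain function so it runs without downloading the model (the real template lives in the model's tokenizer config and also handles things like Qwen3's thinking mode, which this simplification ignores):

```python
def format_chatml(messages, add_generation_prompt=True):
    """Simplified imitation of a ChatML-style chat template,
    similar in spirit to what tokenizer.apply_chat_template() produces
    for Qwen models. For illustration only."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "user",
     "content": "Translate the following text from Finnish into English.\nHyvää huomenta."},
]
print(format_chatml(messages))
```

In the real pipeline you would call `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` instead; the sketch only shows why the prompt ends with an opened assistant turn, which is what cues the model to generate the translation.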