Best LLM for 🇫🇮Finnish (low-resource languages) Translation?
In the previous experiment, we concluded that google/madlad400-3b-mt is the best choice for running machine translation locally. The newly released Qwen3 "supports 100+ languages and dialects." The Qwen series has consistently demonstrated strong performance in translating between resource-rich languages like English and Chinese. This time, we're testing its support for the low-resource language Finnish.
Using the same test data as before, here are the inference code and prompt for Qwen3:
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# Local helpers from the previous experiment
from autodevice import get_device
from eval_fi_cn import benchmark
device = get_device()
model_name = "Qwen/Qwen3-4B"
base_prompt = "Translate the following text from Finnish into English."
print(f"Using model '{model_name}' with prompt '{base_prompt}'")
# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    # device_map=device
)
model = model.to(device)
def generateBatch(batch):
    prompts = []
    for i, text in enumerate(batch, start=1):
        print(f'\033[94m{i}. {text}\033[0m')
        # We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
        messages = [
            {"role": "user", "content": f"{base_prompt}\n{text}."},
        ]
        prompt =…
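For readers unfamiliar with chat templates: Qwen's tokenizer renders the `messages` list into a ChatML-style string before tokenization. A minimal sketch of that rendering, written as a plain function so it runs without downloading the model (the real template lives in the model's tokenizer config and also handles things like Qwen3's thinking mode, which this simplification ignores):

```python
def format_chatml(messages, add_generation_prompt=True):
    """Simplified imitation of a ChatML-style chat template,
    similar in spirit to what tokenizer.apply_chat_template() produces
    for Qwen models. For illustration only."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "user",
     "content": "Translate the following text from Finnish into English.\nHyvää huomenta."},
]
print(format_chatml(messages))
```

In the real pipeline you would call `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` instead; the sketch only shows why the prompt ends with an opened assistant turn, which is what cues the model to generate the translation.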