Korean Misclassified as Chinese Validator Fix
Got it — in this debug output:
Debug: raw langdetect output: [('ko', 0.9999995211930662)]
...
Detected languages: [('ko', 0.9999995211930662)]
The code ko is the ISO 639-1 language code for Korean. 🇰🇷
So langdetect is saying:
- It thinks the text is Korean (ko)
- With probability ≈ 0.9999995 (essentially 100%)
That explains the error: your validator expected "zh-tw" (Traditional Chinese), but the detector decided the input looks like Korean instead.
👉 Quick mapping reminder from langdetect / ISO 639-1:
- zh → Chinese (unspecified)
- zh-cn → Simplified Chinese (China)
- zh-tw (or hant in your mapping) → Traditional Chinese (Taiwan)
- ja → Japanese
- ko → Korean
- ne → Nepali
- hi → Hindi
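For error messages, it can help to translate these codes into readable names. A hypothetical helper mirroring the table above (`LANG_NAMES` and `describe` are my own names, not langdetect API):

```python
# Hypothetical mapping from langdetect / ISO 639-1 codes to readable names.
LANG_NAMES = {
    "zh": "Chinese (unspecified)",
    "zh-cn": "Simplified Chinese (China)",
    "zh-tw": "Traditional Chinese (Taiwan)",
    "ja": "Japanese",
    "ko": "Korean",
    "ne": "Nepali",
    "hi": "Hindi",
}

def describe(code: str) -> str:
    # Unknown codes fall through unchanged so the raw code is still visible.
    return LANG_NAMES.get(code, code)

print(describe("ko"))  # -> Korean
```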
Do you want me to suggest how you could harden your validator so it doesn’t fail when langdetect misclassifies (e.g. allowing a fallback if the target script characters are clearly Chinese but the code comes out as ko)?
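As a concrete starting point for that hardening, here is a sketch of a script-based fallback: if the detector's label disagrees with the expected Chinese variant but the text is mostly Han ideographs, trust the script evidence over the label. Function names and the 0.5 threshold are my own assumptions:

```python
def han_ratio(text: str) -> float:
    """Fraction of non-space characters in the CJK Unified Ideographs blocks."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    han = sum(
        1 for c in chars
        if "\u4e00" <= c <= "\u9fff"  # CJK Unified Ideographs
        or "\u3400" <= c <= "\u4dbf"  # CJK Extension A
    )
    return han / len(chars)

def validate_language(text: str, detected: str, expected: str,
                      threshold: float = 0.5) -> bool:
    # Exact match, or same base language (e.g. "zh-cn" satisfies "zh").
    if detected == expected or detected.split("-")[0] == expected.split("-")[0]:
        return True
    # Fallback: expected a Chinese variant and the text is mostly Han
    # ideographs, so the detector's label (e.g. "ko") is likely wrong.
    if expected.startswith("zh") and han_ratio(text) >= threshold:
        return True
    return False

print(validate_language("歡迎光臨，請多指教。", "ko", "zh-tw"))  # -> True
```

This keeps your validator strict for genuinely foreign input (Hangul scores zero on `han_ratio`) while tolerating detector mislabels on text that is visibly Chinese.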