Many large language models (LLMs) are trained on data that is dominated by English and reflects Western perspectives. But what does that mean for users in Southeast Asia, where more than 1,200 languages are spoken?
In this video, Elina Noor and Binya Kanitroj explore how AI models like ChatGPT, Gemini, and Claude struggle with the region’s cultural contexts, historical narratives, and identities. These blind spots aren’t just small errors; they reflect a deeper problem of cultural representation in AI.
Now, developers across Southeast Asia are pushing back. They’re building AI models in languages like Thai, Malay, Indonesian, and Khmer to democratize AI access and challenge the dominance of Western and Chinese models. But they face tough choices: should they train new models from scratch or fine-tune existing ones? And how do they navigate the growing geopolitical competition over AI?