Hacker Remix

Brood War Korean Translations

220 points by todsacerdoti 9 hours ago | 74 comments

jaeyounkg 8 hours ago

This was a fun read, as someone who's both a Korean BW player and a speech recognition researcher.

It's interesting to note that the original Korean transcription already has many errors, seemingly (and impressively) corrected by LLMs later on. For example, 12 안마당 빌드 (12 courtyard build) is actually 12 앞마당 빌드 (12 frontyard build), which might have been more understandable to BW players. Similarly 투에처리 빌드 (processing-at-two build? makes no sense lol) should have been transcribed 투해처리 빌드 (two-Hatchery build).

Therefore it may also be helpful to directly feed the slang dictionary into Whisper's inference process using contextual biasing. There are lots of ways to do this, but the simplest would be to increase the probability of slang words in the dictionary in the final prediction layer of Whisper by a constant factor. This is fairly easy to implement, for example by using HuggingFace's library: https://huggingface.co/docs/transformers/en/internal/generat...
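A minimal sketch of the "constant factor" idea, in plain Python rather than against Whisper or any real library API. Adding log(factor) to a token's logit multiplies its unnormalized probability by that factor. The toy vocabulary, the logit values, and the names `bias_logits`, `slang_ids`, and `factor` are all made up for illustration. Real Whisper decoding works on subword tokens, so an actual implementation would need to bias token sequences rather than whole words.

```python
import math


def bias_logits(logits, biased_token_ids, factor):
    """Boost selected tokens: adding log(factor) to a logit multiplies
    that token's unnormalized probability by `factor`."""
    boost = math.log(factor)
    return [x + boost if i in biased_token_ids else x
            for i, x in enumerate(logits)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical one-step decoding state: whole words stand in for tokens.
vocab = ["안마당", "앞마당", "투에처리", "투해처리"]
logits = [2.0, 1.5, 1.0, 0.8]   # invented scores; the wrong word leads
slang_ids = {1, 3}              # indices of words found in the slang dictionary

biased = bias_logits(logits, slang_ids, factor=3.0)

best_before = vocab[argmax(logits)]
best_after = vocab[argmax(biased)]
p_before = softmax(logits)
p_after = softmax(biased)
```

With these invented numbers the top candidate flips from 안마당 to 앞마당 once the dictionary words are boosted. The factor is a tuning knob: too small and nothing changes, too large and the model starts hallucinating slang where none was spoken.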

woodson 3 hours ago

Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree shallow contextual biasing should help. While it's not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it could be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789

chongli 5 hours ago

I am a StarCraft fan and I have no idea what a courtyard or a frontyard is supposed to be! However I do know that the names of buildings, units, technologies, and strategies are usually heavily abbreviated in English. Perhaps the same is true in Korean? A 12 barracks build would usually just be called "12 rax", a two hatchery mutalisk build would be called "2 hatch muta", and a three hatchery hydralisk timing attack / all-in would be called "3 hatch hydra bust".

rcthompson 5 hours ago

I believe the equivalent term used in English (exhibited in the new translation) is "natural", short for "natural expansion", which refers to the obvious location where the player should build their first expansion. It sounds like the term used in Korean for this concept literally means "front yard" rather than matching the English term.

Reason077 4 hours ago

Makes sense. And presumably the 12 means that you expand to your natural ("courtyard") with your 12th worker unit (probe, in the case of Protoss).

sushid 3 hours ago

Not the parent commenter but not always. 9 pool just means you build a spawning pool at your main, for instance. This worker-prefix building build-order naming system also breaks down once people start referencing builds like 2 rax academy, 3 hatch muta, etc.

Reason077 3 hours ago

Right, "9 pool" means build a spawning pool when you have 9 workers. So "12 courtyard" means build an expansion when you have 12 workers.

thaumasiotes 3 hours ago

I think strictly "9 pool" means you build the pool when you have 9 supply. However, before you build a spawning pool, the only thing you can build that consumes supply is workers.

starcraftgamer 5 hours ago

A lot of Korean slang is a little different. Source: not Korean but have been in the English community a long time and picked some stuff up.

"1rax double" is equivalent to "1rax expand" or "1rax CC". They use multi or double to mean expand in the early game. Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card or "han-bang" which means an army or attack on few resources.

I am not sure what short-hand they use for barracks, gateway, etc.

chongli 4 hours ago

> Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card

That’s a really interesting one to me! One thing I’ve noticed is that Koreans do not seem to have the same hangups / negative attitude towards cheese strategies as westerners do!

bee_rider 8 hours ago

Do they actually use the Korean word for, like, tossing something to refer to the Protoss? That’s a pretty funny cross-language pun if so.

asdasdsddd 7 hours ago

Half of the words in the Korean blurb are just romanizations. Even build is just bil-deu

sushid 3 hours ago

No, Protoss is just 토스, which is just hangulization of "Toss" aka Protoss.

jaeyounkg 8 hours ago

Haha, no, I actually never associated this with the English word toss lol.

diziet 32 minutes ago

I was able to understand the Google Translate version well, but I am very familiar with the intricacies of BW and Zerg 12hatch openers.

ChatGPT and Claude did an incredible job translating the Korean text:

Claude:

  Today I'll teach you about the 12 Hatchery build. I'll explain the types of 12 Hatchery builds, their advantages and disadvantages, and the build orders in a simple but detailed way.
  Against Protoss, this is the build you use when you want to start with the most economic advantage. Against Terran, there are several builds you can do with 12 Hatchery, so I'll explain some of the most commonly used builds.
  The first is the two-hatchery build that starts with 12 Hatchery:
  12 Hatchery
  11 Spawning Pool
  10 Gas
  This build uses early gas, and it's often used when you want to quickly transition into a three-hatchery build with three gas bases.
  The second build is:
  12 Hatchery
  12 Pool
  12 Gas
  This build allows for moderately fast tech tree and moderately fast three-hatchery expansion. This build is commonly known as the "safe three-hatchery" build, and you can think of it as a build that enables both quick Mutalisks and quick third base.

leshokunin 8 hours ago

Don’t let the title fool you: this is an extremely thorough and creative take on translating StarCraft commentary and making it more approachable.

As the author rightly points out, in its 27 years of existence, commentary around the game has become a domain specific language. Not just Korean or English.

This approach of automated scripting and using AI to understand roughly what was said and then make it coherent is really cool.

jaimebuelta 5 hours ago

LOL, as a non-native English speaker, reading this reminds me of EXACTLY the same problem of translating many things, but more precisely, computer articles and software development.

There’s a huge number of terms that are difficult to translate (sharding? Hash?). The only real solution is to adopt them into your language, more or less adapted, which is what happens over time. But it requires a community that, to some degree, is able to cross the gap between the languages. In this case, learning English.

Talking about software development in Spanish (my native language) is a succession of imported terms from English.

I don’t think there’s a good way of doing that, and I’m interested to see how automatic translations deal with it, because the only way this can work is with a process of mixing both languages in a social way and seeing what terms evolve from that process.

And you need, in the terms the post describes, people that know Korean at least in a non-fluent way. And the game itself, of course.

jordigh 2 hours ago

With Spanish we have the added complexity that there are different linguistic traditions around the world. For example, in Mexico I learned "depurar", an existing Spanish word that closely fits the meaning of "debug". However, many Spanish speakers simply say "debuguear", just directly borrowing the English word. In Mexico I also learned "desempeño" to describe the performance of a computer or software, but in Argentina I've heard "el performance" to say the same.

I think the most common thing is to just use English loanwords without trying to find existing Spanish words that fit the meaning.

BlueTemplar 51 minutes ago

Why would sharding and hash be difficult to translate when they use metaphors that are easy to visualize in a "physical" context ?

MichaelDickens 36 minutes ago

I think the words' metaphorical meanings don't help much unless you already know what they mean. If you heard the word "sharding" for the first time and all you knew was that it had something to do with computers, I think you'd have a hard time guessing that it means "partitioning rows of a database across multiple servers to reduce load".
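For readers who haven't met either term, here is a rough sketch of how "hash" and "sharding" fit together under that definition. It is a toy, not how any particular database does it: `shard_for` and `partition` are hypothetical names, the "servers" are just Python lists, and the rows are invented.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the row key. A stable hash (SHA-256 here)
    means the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(rows, num_shards):
    """Split rows across `num_shards` buckets, one bucket per 'server'."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row["id"], num_shards)].append(row)
    return shards


# Invented data: 100 user rows spread over 4 pretend servers.
rows = [{"id": f"user-{i}", "name": f"Player {i}"} for i in range(100)]
shards = partition(rows, num_shards=4)
```

Each row ends up in exactly one bucket, and looking a row up later only requires hashing its key again to find which server to ask. One known weakness of this modulo scheme is that changing the number of shards moves most keys, which is why real systems often use something like consistent hashing instead.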