I created this AI architecture based on a scientific paper. The main idea is simple: give the computer two separate monolingual datasets (for example, one in Italian and one in English) and let it learn how to translate between them.
So, how is this different from regular translation models like Marian or standard seq2seq architectures?
The key difference is that it doesn’t need parallel data, i.e. sentence pairs that are direct translations of each other. This is a big deal for low-resource languages, which often lack large translation datasets.
How does it work?
The model takes sentences from one language—let’s say Italian—and translates them into the other, like English. The English translation is then translated back into Italian, and the model is rewarded according to how closely the back-translation matches the original sentence, measured with the BLEU score. Over time, this round-trip reward pushes the model to produce translations that preserve the meaning of the input.
For example:
Il gatto è sulla sedia → The cat is on the chair → Il gatto è sulla sedia
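
Here is a minimal sketch of that round-trip reward, assuming the two translation directions are available as functions. The `translate_it_en` and `translate_en_it` names are hypothetical stand-ins for the model’s two directions (they are not from the original paper); `sacrebleu` is used to score how closely the back-translation matches the original sentence.

```python
# Round-trip (back-translation) reward sketch.
# translate_it_en / translate_en_it are hypothetical stand-ins
# for the two directions of the translation model.
import sacrebleu

def round_trip_reward(original_it: str,
                      translate_it_en,
                      translate_en_it) -> float:
    """Translate Italian -> English -> Italian and reward the model
    with the BLEU score between the original and the back-translation."""
    english = translate_it_en(original_it)      # "Il gatto ..." -> "The cat ..."
    back_translated = translate_en_it(english)  # "The cat ..." -> "Il gatto ..."
    # sentence_bleu returns a score in [0, 100]; scale to [0, 1] as a reward
    return sacrebleu.sentence_bleu(back_translated, [original_it]).score / 100.0
```

For instance, with the round trip above ("Il gatto è sulla sedia" → "The cat is on the chair" → "Il gatto è sulla sedia"), the back-translation exactly matches the original, so the reward is 1.0; a back-translation that drifts from the original would score lower.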
While this method currently scores lower than traditional supervised models, there is room for improvement, and it could open up new possibilities for translation, especially for low-resource languages.