#machinetranslation — Public Fediverse posts

Eggs now in different baskets. @[email protected] · 2026-05-04 · 05:53 UTC

This week is going to be a bit busy and a bit tough for me.

Not in absolute terms but in terms of what I can deal with at the moment.

For everyone out there with a similar week ahead of them - "hou je taai!".

Aside - I just discovered that many online (AI powered?) translators mistranslate that Dutch phrase!

Sigh.

Time for some more coffee.

#machinetranslation #houjetaai #monday

Loek van Kooten, MA @[email protected] · 2026-05-01 · 14:12 UTC

And it shows the reasoning under the segment so the translator can argue with it. Both the pro-AI and the anti-AI camps in localization disagree with us. Translators don't. Read it: https://www.c4ttitude.com/blog/vethric-won-a-marathon-what-happens-when-you-let-ai-translate-without-the-lore/ What's the wildest thing your MT engine has produced without context? #Localization #MachineTranslation #CATtools #Translation #AItranslation

#localization #machinetranslation #cattools #translation #aitranslation

juergen_hubert @[email protected] · 2026-04-20 · 06:25 UTC

Why I refuse to use Machine Translation

In the last few years, there has been a lot of talk about how artificial intelligence (actually: commercial chatbots and LLMs) will be transforming our way of working – how it will make some jobs more efficient, and others obsolete. There are also concerns that such systems do not live up to the hype – though this has not stopped CEOs and their consultants from pushing them into the workplace, in the hopes of drastically reducing their workforce and labor costs even though these systems cannot substitute for their workers’ process knowledge.

I translate old German folk tales into English, and translation work is already heavily automated these days due to the sheer amount of material that needs to be translated. Thus, it is unsurprising that many people have asked me whether I use machine translation for my work – usually with the assumption that this would save me time.

In this essay, I am going to tell you why I won’t use AI systems for my translation work. I could talk about the ethical concerns – how the work of others is used to train LLM systems without compensation while charging for their output, or how they consume massive amounts of electricity and other resources while our planet and its ecosystems are already on the precipice, or how they are used to build up the mother of all investment bubbles.

I could also add some personal grievances. For instance, in my day job as a bid manager, I also have to price server systems for our customers, and when I recently noticed that a simple 16 GB DDR5 RAM module had a purchase price of €1,600, I realized that something is going very wrong indeed. Furthermore, anonymous bot networks are constantly scraping my websites for LLM training data, forcing me to upgrade my website hosting plan twice last fall to keep outages at a tolerable level.

But since others have elaborated on the ethical concerns in much more detail than I ever could, I won’t be talking about these further. Instead, I will be discussing the practical reasons why machine translation does not fit into my working processes when translating German folk tales.

Reading the Fraktur Typeface

The first challenge for machine translation is parsing the source material. For copyright reasons, I exclusively use public domain works – German folk tale collections which were largely published in the 19th century. And the vast majority of these works were not printed in modern Antiqua letters, but in the old German Fraktur typeface. Here is a reasonably “clean” example of a story I have translated (the source page is here):

Usually, texts that are converted into a new language by machine translation are already in a machine-readable format – but these old digital scans are not. Thus, before I could use machine translation for these texts, I would need to convert them into a machine-readable format. While OCR (“Optical Character Recognition”) tools exist that can handle Fraktur typefaces, the output would require additional effort for proofreading, especially since the input data is highly variable in its quality.

Thus, in contrast to the original premise, machine translation would actually increase my workload even before I got to the actual translation step.

Translating Old Words and Phrases

LLM systems are largely trained on the most commonly available modern texts (such as Reddit posts). 19th-century German folk tales are not “modern texts”. They are rife with old words and phrases that were only used in some small geographical area and are no longer in modern use. Would a standard machine translation system (i.e., one trained on Reddit) come up with a decent translation for “Bindelbaum” – to pick just one example that stuck in my mind? Especially considering that the old texts that could provide some context were not in a machine-readable format, and thus of limited use for training the LLMs?

Perhaps it could, and perhaps it couldn’t. However, “maybe this is an accurate translation” is not good enough for my purposes, and indeed, it is not sufficient for any professional translator. If I provide a translation for certain old words and phrases, I need to be as sure as possible that this translation is accurate – and if I am uncertain, I need to explain that to my readers as well.

Thus, I would have to double-check every machine-translated text I work with against my own research – which, again, would not save me any time. And if I am doing all the research anyway, I might as well skip the machine translation and do it all by myself in the first place.

Providing Context

But truth be told, the actual translation is the easiest part of my work. German folk tales were told in a specific time and a specific cultural context. The original audience for these tales (mostly 19th-century German peasants) was deeply familiar with this context.

A modern audience will usually not be familiar with this context. Many aspects of these folk tales are hard to grasp even for modern Germans – so what chance does an international audience have?

This is why one of my most important tasks as a translator is to explain this context. This is why my books have many hundreds of footnotes, and explanatory commentary following each tale. While I am not primarily writing my books as scientific treatises, I have spent enough years in academia that I have strong views on providing inaccurate information. Sure, mistakes can and will happen. But allowing errors to proliferate in my manuscripts because I was outsourcing the most critical aspects of my research to LLM systems would be a gross violation of ethical standards (not that this seems to stop a lot of LLM users…).

So I will do my research the proper way. And with each paragraph I translate, I contemplate its hidden meanings and context, and how to convey it to my readers. But if I don’t do the first step of the work myself – that is, translating and thinking about every single sentence – then I have already lost my first opportunity to truly understand the story.

Preserving Unique Voices

German folk tales were told by tens of thousands of people, each of whom had their own unique way of telling their stories. And later on, they were collected by hundreds of folklore researchers, each of whom had their own unique editorial approach. That adds up to a lot of unique voices.

However, LLMs are well known to generate texts that trend towards the average. They have been trained on vast archives of human-written texts, and their task is to create texts that are “most likely” to fit the prompt – the common denominator, if you will. Worse, it will be the common denominator of Reddit users and the like. The only LLM system that might come even close to capturing the unique voices of the original texts would be one that has been trained exclusively on their translations – including my translations.

While I want people to be entertained by my translations, these tales are also part of my country’s cultural heritage. Not even trying to capture the unique voices of these long-ago storytellers and instead replacing them with the generic output of LLMs feels hugely disrespectful.

They deserve better, and my audience deserves better as well.

#llm #machinetranslation #translation

juergen_hubert @[email protected] · 2026-04-20 · 06:25 UTC

Why I refuse to use Machine Translation

In the last few years, there has been a lot of talk about how artificial intelligence (actually: commercial chatbots and LLMs) will be transforming our way of working – how it will make some jobs more efficient, and others obsolete. There are also concerns that such systems do not live up to the hype – though this has not stopped CEO and their consultants from pushing them into the workplace, in the hopes of drastically reducing their work force and labor costs even though they cannot substitute for their workers’ process knowledge.

I translate old German folk tales into English, and translation work is already heavily automated these days due to the sheer amount of material that needs to be translated. Thus, it is unsurprising that many people have asked me whether I use machine translation for my work – usually with the assumption that this would save me time.

In this essay, I am going to tell you why I won’t use AI systems for my translation work. I could talk about the ethical concerns – how the work of others is used to train LLM systems without compensation while charging for their output, or how they consume massive amounts of electricity and other resources while our planet and its ecosystems are already on the precipice, or how they are used to build up the mother of all investment bubbles.

I could also add some personal grievances. For instance, in my day job as a bid manager, I also have to price server systems for our customers, and when I recently noticed that a simple 16 GB DDR5 RAM module had a purchase price of €1,600, I realized that something is going very wrong indeed. Furthermore, anonymous bot networks are constantly scraping my websites for LLM training data, forcing me to upgrade my website hosting plan twice last fall to keep outages at a tolerable level.

But since others have elaborated on the ethical concerns in much more detail than I ever could, I won’t be talking about these further. Instead, I will be discussing the practical reasons why machine translation does not fit into my working processes when translating German folk tales.

Reading the Fraktur Typeset

The first challenge for machine translation is parsing the source material. For copyright reasons, I exclusively use public domain works – German folk tale collections which were largely published in the 19th century. And the vast majority of these works were not printed with the modern Antiqua letters, but the old German Fraktur typeset. Here is a reasonably “clean” example of a story I have translated (the source page is here):

Usually, texts that are converted into a new language by machine translation are already in a machine-readable format – but these old digital scans are not. Thus, before I could use machine translations for these texts, I would need to convert them into a machine-readable format. While OCR (“Optical Character Recognition”) tools exist that can handle Fraktur typesets, the output would require additional effort for proofreading, especially since the input data is highly variable in its quality.

Thus, in contrast to the original premise, machine translation would actually increase my workload even before I got to the actual translation step.

Translating Old Words and Phrases

LLM systems are largely trained on the most commonly available modern texts (such as Reddit posts). 19th century German folk tales are not “modern texts”. They are rife with old words and phrases that were only used in some small geographical area and are no longer in modern use. Would a standard machine translation system (i.e., one trained on Reddit) come up with a decent translation for “Bindelbaum” – to pick just one example that stuck in my mind? Especially considering that the old texts that could provide some context were not in a machine-readable format, and thus of limited use for training the LLMs?

Perhaps they could, and perhaps they couldn’t. However, “maybe this is an accurate translation” is not good enough for my purposes, and indeed, it is not sufficient for any professional translator. If I provide a translation for certain old words and prices, I need to be as sure as possible that this translation is accurate – and if I am uncertain, I need to explain that to my readers as well.

Thus, I would have to double-check every machine-translated text I work with with my own research – which, again, would not save me any time. And if I am doing all the research anyway, I might as well skip the machine translation and do it all by myself in the first place.

Providing Context

But truth to be told, the actual translation is the easiest part of my work. German folk tales were told in a specific time and a specific cultural context. The original audience for these tales (mostly 19th century German peasants) were deeply familiar with this context.

A modern audience will usually not be familiar with this context. Many aspects of these folk tales are hard to grasp even for modern Germans – so what chance does an international audience have?

This is why one of my most important tasks as a translator is to explain this context. This is why my books have many hundreds of footnotes, and explanatory commentary following each tale. While I am not primarily writing my books as scientific treatises, I have spent enough years in academia that I have views on providing inaccurate information. Sure, mistakes can and will happen. But allowing errors to proliferate in my manuscripts because I was outsourcing the most critical aspects of my research to LLM systems would be a gross violation of ethical standards (not that this seems to stop a lot of LLM users…).

So I will do my research the proper way. And with each paragraph I translate, I contemplate its hidden meanings and context, and how to convey it to my readers. But if I don’t do the first step of the work myself – that is, translating and thinking about every single sentence – then I have already lost my first opportunity to truly understand the story.

Preserving Unique Voices

German folk tales were told by tens of thousands of people, each of whom had their own unique way of telling their stories. And later on, they were collected by hundreds of folklore researchers, each of whom had their own unique editorial approach. That adds up to a lot of unique voices.

However, LLMs are well-known to generate texts that trend towards the average. They have been trained on vast archives of human-written texts, and their task is to create texts that are “most likely” to fit the prompt – the common denominator, if you will. Worse, it will be the most common denominator of Reddit users and the like. The only LLM system that might even come even close to capturing the unique voices of the original texts would be one that has been trained exclusively on their translations – including my translations.

While I want people to be entertained by my translations, these tales are also part of my country’s cultural heritage. Not even trying to capture the unique voices of these long-ago storytellers and instead replacing them with the generic output of LLMs feels hugely disrespectful.

They deserve better, and my audience deserves better as well.

#llm #machinetranslation #translation

juergen_hubert @[email protected] · 2026-04-20 · 06:25 UTC

Why I refuse to use Machine Translation

In the last few years, there has been a lot of talk about how artificial intelligence (actually: commercial chatbots and LLMs) will be transforming our way of working – how it will make some jobs more efficient, and others obsolete. There are also concerns that such systems do not live up to the hype – though this has not stopped CEO and their consultants from pushing them into the workplace, in the hopes of drastically reducing their work force and labor costs even though they cannot substitute for their workers’ process knowledge.

I translate old German folk tales into English, and translation work is already heavily automated these days due to the sheer amount of material that needs to be translated. Thus, it is unsurprising that many people have asked me whether I use machine translation for my work – usually with the assumption that this would save me time.

In this essay, I am going to tell you why I won’t use AI systems for my translation work. I could talk about the ethical concerns – how the work of others is used to train LLM systems without compensation while charging for their output, or how they consume massive amounts of electricity and other resources while our planet and its ecosystems are already on the precipice, or how they are used to build up the mother of all investment bubbles.

I could also add some personal grievances. For instance, in my day job as a bid manager, I also have to price server systems for our customers, and when I recently noticed that a simple 16 GB DDR5 RAM module had a purchase price of €1,600, I realized that something is going very wrong indeed. Furthermore, anonymous bot networks are constantly scraping my websites for LLM training data, forcing me to upgrade my website hosting plan twice last fall to keep outages at a tolerable level.

But since others have elaborated on the ethical concerns in much more detail than I ever could, I won’t be talking about these further. Instead, I will be discussing the practical reasons why machine translation does not fit into my working processes when translating German folk tales.

Reading the Fraktur Typeset

The first challenge for machine translation is parsing the source material. For copyright reasons, I exclusively use public domain works – German folk tale collections which were largely published in the 19th century. And the vast majority of these works were not printed with the modern Antiqua letters, but the old German Fraktur typeset. Here is a reasonably “clean” example of a story I have translated (the source page is here):

Usually, texts that are converted into a new language by machine translation are already in a machine-readable format – but these old digital scans are not. Thus, before I could use machine translations for these texts, I would need to convert them into a machine-readable format. While OCR (“Optical Character Recognition”) tools exist that can handle Fraktur typesets, the output would require additional effort for proofreading, especially since the input data is highly variable in its quality.

Thus, in contrast to the original premise, machine translation would actually increase my workload even before I got to the actual translation step.

Translating Old Words and Phrases

LLM systems are largely trained on the most commonly available modern texts (such as Reddit posts). 19th century German folk tales are not “modern texts”. They are rife with old words and phrases that were only used in some small geographical area and are no longer in modern use. Would a standard machine translation system (i.e., one trained on Reddit) come up with a decent translation for “Bindelbaum” – to pick just one example that stuck in my mind? Especially considering that the old texts that could provide some context were not in a machine-readable format, and thus of limited use for training the LLMs?

Perhaps they could, and perhaps they couldn’t. However, “maybe this is an accurate translation” is not good enough for my purposes, and indeed, it is not sufficient for any professional translator. If I provide a translation for certain old words and prices, I need to be as sure as possible that this translation is accurate – and if I am uncertain, I need to explain that to my readers as well.

Thus, I would have to double-check every machine-translated text I work with with my own research – which, again, would not save me any time. And if I am doing all the research anyway, I might as well skip the machine translation and do it all by myself in the first place.

Providing Context

But truth to be told, the actual translation is the easiest part of my work. German folk tales were told in a specific time and a specific cultural context. The original audience for these tales (mostly 19th century German peasants) were deeply familiar with this context.

A modern audience will usually not be familiar with this context. Many aspects of these folk tales are hard to grasp even for modern Germans – so what chance does an international audience have?

This is why one of my most important tasks as a translator is to explain this context. This is why my books have many hundreds of footnotes, and explanatory commentary following each tale. While I am not primarily writing my books as scientific treatises, I have spent enough years in academia that I have views on providing inaccurate information. Sure, mistakes can and will happen. But allowing errors to proliferate in my manuscripts because I was outsourcing the most critical aspects of my research to LLM systems would be a gross violation of ethical standards (not that this seems to stop a lot of LLM users…).

So I will do my research the proper way. And with each paragraph I translate, I contemplate its hidden meanings and context, and how to convey it to my readers. But if I don’t do the first step of the work myself – that is, translating and thinking about every single sentence – then I have already lost my first opportunity to truly understand the story.

Preserving Unique Voices

German folk tales were told by tens of thousands of people, each of whom had their own unique way of telling their stories. And later on, they were collected by hundreds of folklore researchers, each of whom had their own unique editorial approach. That adds up to a lot of unique voices.

However, LLMs are well-known to generate texts that trend towards the average. They have been trained on vast archives of human-written texts, and their task is to create texts that are “most likely” to fit the prompt – the common denominator, if you will. Worse, it will be the most common denominator of Reddit users and the like. The only LLM system that might even come even close to capturing the unique voices of the original texts would be one that has been trained exclusively on their translations – including my translations.

While I want people to be entertained by my translations, these tales are also part of my country’s cultural heritage. Not even trying to capture the unique voices of these long-ago storytellers and instead replacing them with the generic output of LLMs feels hugely disrespectful.

They deserve better, and my audience deserves better as well.

#llm #machinetranslation #translation

juergen_hubert @[email protected] · 2026-04-20 · 06:25 UTC

Why I refuse to use Machine Translation

In the last few years, there has been a lot of talk about how artificial intelligence (actually: commercial chatbots and LLMs) will be transforming our way of working – how it will make some jobs more efficient, and others obsolete. There are also concerns that such systems do not live up to the hype – though this has not stopped CEO and their consultants from pushing them into the workplace, in the hopes of drastically reducing their work force and labor costs even though they cannot substitute for their workers’ process knowledge.

I translate old German folk tales into English, and translation work is already heavily automated these days due to the sheer amount of material that needs to be translated. Thus, it is unsurprising that many people have asked me whether I use machine translation for my work – usually with the assumption that this would save me time.

In this essay, I am going to tell you why I won’t use AI systems for my translation work. I could talk about the ethical concerns – how the work of others is used to train LLM systems without compensation while charging for their output, or how they consume massive amounts of electricity and other resources while our planet and its ecosystems are already on the precipice, or how they are used to build up the mother of all investment bubbles.

I could also add some personal grievances. For instance, in my day job as a bid manager, I also have to price server systems for our customers, and when I recently noticed that a simple 16 GB DDR5 RAM module had a purchase price of €1,600, I realized that something is going very wrong indeed. Furthermore, anonymous bot networks are constantly scraping my websites for LLM training data, forcing me to upgrade my website hosting plan twice last fall to keep outages at a tolerable level.

But since others have elaborated on the ethical concerns in much more detail than I ever could, I won’t be talking about these further. Instead, I will be discussing the practical reasons why machine translation does not fit into my working processes when translating German folk tales.

Reading the Fraktur Typeset

The first challenge for machine translation is parsing the source material. For copyright reasons, I exclusively use public domain works – German folk tale collections which were largely published in the 19th century. And the vast majority of these works were not printed with the modern Antiqua letters, but the old German Fraktur typeset. Here is a reasonably “clean” example of a story I have translated (the source page is here):

Usually, texts that are converted into a new language by machine translation are already in a machine-readable format – but these old digital scans are not. Thus, before I could use machine translations for these texts, I would need to convert them into a machine-readable format. While OCR (“Optical Character Recognition”) tools exist that can handle Fraktur typesets, the output would require additional effort for proofreading, especially since the input data is highly variable in its quality.

Thus, in contrast to the original premise, machine translation would actually increase my workload even before I got to the actual translation step.

Translating Old Words and Phrases

LLM systems are largely trained on the most commonly available modern texts (such as Reddit posts). 19th century German folk tales are not “modern texts”. They are rife with old words and phrases that were only used in some small geographical area and are no longer in modern use. Would a standard machine translation system (i.e., one trained on Reddit) come up with a decent translation for “Bindelbaum” – to pick just one example that stuck in my mind? Especially considering that the old texts that could provide some context were not in a machine-readable format, and thus of limited use for training the LLMs?

Perhaps they could, and perhaps they couldn’t. However, “maybe this is an accurate translation” is not good enough for my purposes, and indeed, it is not sufficient for any professional translator. If I provide a translation for certain old words and prices, I need to be as sure as possible that this translation is accurate – and if I am uncertain, I need to explain that to my readers as well.

Thus, I would have to double-check every machine-translated text I work with with my own research – which, again, would not save me any time. And if I am doing all the research anyway, I might as well skip the machine translation and do it all by myself in the first place.

Providing Context

But truth to be told, the actual translation is the easiest part of my work. German folk tales were told in a specific time and a specific cultural context. The original audience for these tales (mostly 19th century German peasants) were deeply familiar with this context.

A modern audience will usually not be familiar with this context. Many aspects of these folk tales are hard to grasp even for modern Germans – so what chance does an international audience have?

This is why one of my most important tasks as a translator is to explain this context. This is why my books have many hundreds of footnotes, and explanatory commentary following each tale. While I am not primarily writing my books as scientific treatises, I have spent enough years in academia that I have views on providing inaccurate information. Sure, mistakes can and will happen. But allowing errors to proliferate in my manuscripts because I was outsourcing the most critical aspects of my research to LLM systems would be a gross violation of ethical standards (not that this seems to stop a lot of LLM users…).

So I will do my research the proper way. And with each paragraph I translate, I contemplate its hidden meanings and context, and how to convey it to my readers. But if I don’t do the first step of the work myself – that is, translating and thinking about every single sentence – then I have already lost my first opportunity to truly understand the story.

Preserving Unique Voices

German folk tales were told by tens of thousands of people, each of whom had their own unique way of telling their stories. And later on, they were collected by hundreds of folklore researchers, each of whom had their own unique editorial approach. That adds up to a lot of unique voices.

However, LLMs are well known to generate texts that trend towards the average. They have been trained on vast archives of human-written texts, and their task is to create texts that are “most likely” to fit the prompt – the common denominator, if you will. Worse, it will be the common denominator of Reddit users and the like. The only LLM system that might come even close to capturing the unique voices of the original texts would be one that has been trained exclusively on their translations – including my translations.

While I want people to be entertained by my translations, these tales are also part of my country’s cultural heritage. Not even trying to capture the unique voices of these long-ago storytellers and instead replacing them with the generic output of LLMs feels hugely disrespectful.

They deserve better, and my audience deserves better as well.

#translation #machinetranslation #llm

ResearchBuzz: Firehose @[email protected] · 2026-03-27 · 11:59 UTC

PCMag: Google Brings Real-Time Headphone Translation to iOS. “The Gemini-powered feature made its Android debut in December and lets you hear translations in real time through your headphones. To give it a try, launch the Translate app, select the desired languages, connect your headphones, and tap the Live Translate button at the bottom left of the home page.”

https://rbfirehose.com/2026/03/27/pcmag-google-brings-real-time-headphone-translation-to-ios/

#google #ios #iphone #livetranslation #machinetranslation #translation

ResearchBuzz: Firehose @[email protected] · 2026-03-23 · 21:13 UTC

Techdirt: Greater Than Zero: The Anti-AI Pushback On Gaming Preservation Efforts Makes No Sense. “I’m not some AI evangelist. I fully recognize that there are error and other problems with AI… and I imagine there always will be, to some extent. AI is not always, or perhaps even mostly, the right tool to use. Nor will it always have benefits that outweigh problems it creates for we human […]

https://rbfirehose.com/2026/03/23/greater-than-zero-the-anti-ai-pushback-on-gaming-preservation-efforts-makes-no-sense-techdirt/

#ai #aiassisted #anime #antiai #editorial #japan

ResearchBuzz: Firehose @[email protected] · 2026-03-20 · 11:04 UTC

Ars Technica: Kagi Translate’s AI answers the question “What would horny Margaret Thatcher say?”. “This week, many people across the Internet have been bemused to find that the AI-powered Kagi Translate can perform these and countless other unlikely ‘translation’ tasks. And while the collective discovery highlights the playful, creative side of large language models, it also exposes the […]

https://rbfirehose.com/2026/03/20/ars-technica-kagi-translates-ai-answers-the-question-what-would-horny-margaret-thatcher-say/

#ai #aiassisted #kagi #languages #machinetranslation #translation

sz_duras - text @[email protected] · 2026-03-19 · 18:37 UTC

March with Heinrich Von Kleist by Heissenbuttel - https://sz-duras.medium.com/march-with-heinrich-von-kleist-heissenbuttel-a13c0614ca12. #machinetranslation #translation

#machinetranslation #translation

Till Grallert @[email protected] · 2026-03-15 · 17:04 UTC

I hate machine-translated support websites of large companies! When German-speaking Osprey customers try to claim its “all mighty guarantee”, how are they supposed to figure out that “Ausstellungsort” (literally “place of issue”) actually means “the location of the problem on the product”, and that “Name des Pakets” (literally “name of the package”) actually means “the product name of the backpack”?

#machinetranslation #customerrelationsfail
