Fw: Translation for time zone conversion runner

Natalie Clarius natalie_clarius at yahoo.de
Sun Feb 12 04:16:43 GMT 2023


Hi Shinjo and Emir,

thanks for the feedback.

The following kinds of time zone names are recognized:
- city (e.g. "Berlin")
- international abbreviation (e.g. "CET")
- long name (e.g. "Central European Standard Time")
- short name (e.g. "GMT+1")
- offset name (e.g. "UTC+01:00")
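
As a rough illustration of the last two kinds in the list, here is a minimal Python sketch of how an offset-style name could be turned into a fixed UTC offset. This is not the runner's actual code; the function name `parse_offset_name` and the accepted pattern are assumptions made for illustration only.

```python
import re
from datetime import timedelta, timezone

# Hypothetical pattern: "UTC" or "GMT", a sign, hours, and optional ":minutes".
OFFSET_RE = re.compile(r"^(?:UTC|GMT)([+-])(\d{1,2})(?::(\d{2}))?$", re.IGNORECASE)

def parse_offset_name(name):
    """Return a fixed-offset tzinfo for names like "UTC+01:00" or "GMT+1", else None."""
    match = OFFSET_RE.match(name.strip())
    if match is None:
        return None
    sign, hours, minutes = match.groups()
    hours, minutes = int(hours), int(minutes or 0)
    # Reject values that cannot be a valid offset.
    if hours > 23 or minutes > 59:
        return None
    delta = timedelta(hours=hours, minutes=minutes)
    return timezone(-delta if sign == "-" else delta)
```

City names, abbreviations and long names cannot be handled by a pattern like this; they would need a lookup against the (localized) time zone data.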

As mentioned in my previous mail, the time zone names are already localized. For Korean, the code tells me that "중부유럽 표준시" would be recognized for CET.

As for the syntax: We cannot just write code that would parse real natural language queries for ~60 different languages. This is not feasible in a runner plugin, and it was never meant to do that, neither in the translated languages nor in English. I can see how the current runner syntax gave that impression though, so I'll try to explain how I intended it to work.

Runners use a simple syntax that gets triggered by keywords in combination with user input according to a pattern that is explained in the user help. The user help can be invoked by clicking on the "?" in KRunner and selecting from the plugin list, or by typing "?" followed by the name of the plugin. E.g. if you enter "?power" in KRunner, it will explain the query syntax for the power runner, among other things that "screen brightness <percentage value>" can be used to set the screen brightness. The "<" ">" just indicate that this is a placeholder where the user is supposed to insert a value like "10" and not literally type in the words "percentage value". This is the common way of writing placeholders in manuals, and I was assuming you were familiar with it; sorry if that caused confusion.

So, for example, you can type "screen brightness 10" in the power runner to set the screen brightness, or "dolphin close" in the windows runner to close a Dolphin window, or "time berlin" to display the current time in Berlin. None of these is a proper phrase in English either. It's a simple fixed pattern that works the same independently of the user language, and what is translated are the individual parts of the query; e.g. in Turkish you set the screen brightness by typing "ekran parlaklığı 10". The idea is the same for the time zone conversion.
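
To make the fixed keyword-plus-value pattern concrete, here is a minimal Python sketch. It is not how the power runner is implemented; the `KEYWORDS` table and the function `match_brightness` are hypothetical and only stand in for the real translation catalog and matching code.

```python
import re

# Keyword translations taken from the examples in this mail; the table itself
# is hypothetical and stands in for the real translation catalog.
KEYWORDS = {
    "en": "screen brightness",
    "tr": "ekran parlaklığı",
}

def match_brightness(query, language):
    """Return the percentage if the query is "<keyword> <percentage value>", else None."""
    keyword = re.escape(KEYWORDS[language])
    # Fixed order: translated keyword first, then the user-supplied number.
    match = re.fullmatch(rf"\s*{keyword}\s+(\d{{1,3}})\s*", query, re.IGNORECASE)
    if match is None:
        return None
    value = int(match.group(1))
    return value if 0 <= value <= 100 else None
```

Only the keyword comes from the translation; the position of the user input stays the same in every language, so "brightness of the screen 10" would not match in English either.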

While we can make fixed known strings available for translation so that the translators are free to provide the exact word order and inflection for that particular string, we cannot do the same for user input, which can be anything. We cannot have an arbitrary order of the parts in the query, or automatically apply a different query parsing strategy for each individual language. The runner needs some way of knowing where one part starts and the other ends. The runner is not smart; it does not and cannot understand natural language.

What I can offer is to make the syntax even simpler. In a previous version, I had the input format as "<from-timezone> <time> <to-timezone>", e.g. "Berlin 8:00 UTC" to convert 08:00 Berlin time to UTC. It was suggested to change it to the "8:00 Berlin in UTC" style because, on the one hand, it was similar to what we already have in the unit conversion runner, and on the other hand, they (a native English speaker) considered it more intuitive. But I see now that this causes more problems than it solves, because it suggests a level of understanding that isn't there, and it won't work for languages other than English. So I am now leaning towards changing the input format back to "<from-timezone> <time> <to-timezone>", which doesn't have any bells and whistles that would cause wrong expectations and inconsistency between languages.
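
A minimal Python sketch of the keyword-free format "<from-timezone> <time> <to-timezone>" follows. It is an illustration under stated assumptions, not the runner's code: the `ZONES` table is hypothetical, uses fixed offsets without daylight saving time, and contains only a few names from this thread; the date is arbitrary.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table with fixed offsets (no daylight saving time);
# real code would resolve localized names against a time zone database.
ZONES = {
    "berlin": timezone(timedelta(hours=1)),
    "cet": timezone(timedelta(hours=1)),
    "utc": timezone.utc,
    "china": timezone(timedelta(hours=8)),
}

# The time in the middle separates the two zone names, so no keyword is needed.
QUERY_RE = re.compile(r"^\s*(.+?)\s+(\d{1,2}):(\d{2})\s+(.+?)\s*$")

def convert(query):
    """Parse "<from-timezone> <time> <to-timezone>"; return the converted "HH:MM" or None."""
    match = QUERY_RE.match(query)
    if match is None:
        return None
    source, hour, minute, target = match.groups()
    try:
        from_zone = ZONES[source.lower()]
        to_zone = ZONES[target.lower()]
        # Arbitrary date; with fixed offsets it does not affect the result.
        moment = datetime(2023, 2, 12, int(hour), int(minute), tzinfo=from_zone)
    except (KeyError, ValueError):
        return None
    return moment.astimezone(to_zone).strftime("%H:%M")
```

With this table, `convert("Berlin 8:00 UTC")` gives "07:00", and `convert("CET 10:00 China")` gives "17:00". Because the time token delimits the two names, nothing in the query itself has to be translated apart from the time zone names.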

If, on the other hand, you say that both in Korean and in Turkish a syntax like "<time> <from timezone> in <to timezone>" (where the time zone names as well as a word for "in" are localized) is at least somewhat feasible, given that we have (likewise translated) user help explaining how to use it, then I would say there is not much of a problem, and you are good to go by simply translating the "in" keyword and any strings that are used in the output formatting.

Let me know what you think or if there are any follow-up questions.
Natalie



----- Forwarded message -----
From: Emir SARI <emir_sari at icloud.com>
To: KDE i18n-doc <kde-i18n-doc at kde.org>
CC: "natalie_clarius at yahoo.de" <natalie_clarius at yahoo.de>
Sent: Saturday, 11 February 2023 at 00:14:08 CET
Subject: Re: Translation for time zone conversion runner

Hello,

For some reason I am not getting the replies from the original author, but anyway. I’ve seen the author's message from Shinjo’s reply.

As a disclaimer, I do not think handling natural language through translatable strings is a good idea; instead, it should be handled via quirks in code for every supported language. Translatable strings simply do not cover all cases, and it does not help that English is one of the simplest languages out there.

> <time> <from timezone> <in> <to timezone>
> 
> For instance, a user can type "10:00 CET in China" to convert 10:00 CET to
> China time and get 17:00 as a result.

Why are the variables in <> brackets? Is there a technical reason for this? As a translator, I have gotten used to seeing input placeholders in <> brackets, and now I realise that there may already be related mistakes because of this. And since these are not reorderable, it’s problematic for my language. Hopefully I won’t need to dive into the code every time I see these again.

In Turkish, due to the Subject-Object-Verb word order, nearly all of the standard English structures need to be flipped. So, “10:00 CET in China” becomes “Çin’de 10:00 MAS” (China-in 10:00 CET; yes, it’s agglutinative, so that “in” becomes an issue already). I can think of something like “10:00 MAS için Çin saati” (China time for 10:00 CET), but I am really not sure if this sounds okay; it feels really forced. Also I’d need separate variants for date and time. It should really be re-orderable.

> "18:00 UTC *to* CET”

I’d just go with something like “18:00 UTC, CET” or “18:00 UTC CET” for the lazy. Not that it’s correct, but omission of a certain textual location makes it easier to forgo. This works very nicely for European languages I see, but not necessarily for others.

> The order "<time> <from timezone> <in> <to timezone>" is fixed, and was
> chosen in analogy to how the unit conversion runner works (e.g. "2 liters
> in milliliters"). Is there any language where this syntax for time zone
> conversion would not be natural at all?

In unit conversion it works in Turkish, because luckily “2 kg in mg” is informally expressed as “2 kg kaç mg” (2 kg how much mg), but not for time zone conversions, which apparently require a more delicate and formal language syntax.

As another example: in Japanese, it’s also possible to use the same unit conversion syntax as in Turkish (2キログラムは何ミリグラム - 2 kilogram is what milligram), but not for time conversion (中国では 10:00 CET - China-in 10:00 CET). Japanese also uses the same word order as Turkish (my Japanese is nowhere near proficient though).

Even without any proper language use, word order and whatnot, the system should be smart enough to make assumptions independently of the language settings, and put out some results without being bound to translatable strings. Otherwise you get frustrated people who get the features advertised to them but still can’t get results due to some translator error or overly literal code.

To improve this in general, it would be nice to make everything re-orderable, to spread out English variants as much as possible, and to have each target translation for these variants support more than one output, separated with ‘;’, where it makes sense. Also avoid string concatenation at all costs!!!
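
One possible reading of the ‘;’-separated variants suggestion, sketched in Python: a single translated string carries several alternative keywords, and the query matches if any of them is used. The function names and the example value "in;to" are hypothetical, and nothing like this exists in the runner as far as this thread shows.

```python
import re

def build_pattern(translated_keywords):
    """Build a query regex from a ';'-separated list of keyword variants."""
    variants = [v.strip() for v in translated_keywords.split(";") if v.strip()]
    alternation = "|".join(re.escape(v) for v in variants)
    # Still a fixed order: "<time> <from timezone> <keyword> <to timezone>".
    return re.compile(
        rf"^\s*(\d{{1,2}}:\d{{2}})\s+(.+?)\s+(?:{alternation})\s+(.+?)\s*$",
        re.IGNORECASE,
    )

def split_query(query, translated_keywords):
    """Return (time, from_zone, to_zone) if the query uses one of the variants, else None."""
    match = build_pattern(translated_keywords).match(query)
    return match.groups() if match else None
```

With the translated string "in;to", both “10:00 CET in China” and “18:00 UTC to CET” are accepted. This widens the set of accepted keywords but does not make the parts re-orderable.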

If something is not clear, let me know. It’s very late over here, I’m not sure all my words made sense.

Best regards,
Emir (𐰽𐰺𐰍)

** E-mail needs to stay simple
** Use plain text e-mail

  