Opened 6 years ago

Last modified 5 years ago

#6935 assigned Bug/Something is broken

setting up a web site for the MF/PL translation team

Reported by: https://id.mayfirst.org/josue
Owned by: https://id.mayfirst.org/josue
Priority: High
Component: Tech
Keywords: translation weblate drupal
Cc: https://launchpad.net/~jeremyb, internationalization@…
Sensitive: no

Description

hey folks,

the internationalization team has developed a workflow for the translation work. i am including it at the end of the ticket. once it is prioritized, that will also be added to this ticket.

we want to test some possible options to implement the workflow. one we are looking at is a drupal module: translation workflow module https://drupal.org/project/translation_management

we need to set up a drupal site with this module installed.

gracias!

--josue


Process Starts

  • Someone produces a document in Eng/Span and posts that document to our system

Translation volunteers are notified

  • Translators receive notification email that there is a document waiting to be translated, with a link to a “cover sheet”, a page that holds some general info about the document in need of translation.
  • Ideas for information that we’d want on each cover sheet:

    • Type of communication, or final destination of text (email, website text, meeting materials, etc.)
    • Preferred deadline, and some kind of scale for how “hard” of a deadline that is. Basically, I want something that’s a little more concrete and communicates time-sensitivity more clearly than our current high/med/low priority classifications.
    • Direction of translation (en-es, es-en)
    • Who posted this text
    • Status indicators! These I see as extremely important for helping all volunteers know if the document still needs work, what stage it is at, etc. In our current program we have “assigned to” to help us figure that out. I think that “assigning” should not be automatic based on who posted. A document should only show itself to be assigned if a volunteer has agreed to translate it. Other status indicators could include:
      • Is there a complete draft?
      • Is feedback wanted?
      • Did the translator indicate that the draft should not be used until it gets feedback? (I’ll talk about this later in the process)
      • There could be boxes that are checked each time a person provides feedback.
      • Finally, we could also have an indicator that the translated document was retrieved and used.

The person who has assigned the translation to themselves performs the translation.

If we decide that we can have a computer translate some things (perhaps for material with fewer than xx words), then the person assigns themselves to be the proofreader.

Some demands we might want to make of the interface for this step of the process:

  • Maintain original formatting
  • Special characters, of course
  • Put the original and the translation side by side on the same screen
  • Optional marks, comments, etc. to include editing notes and suggestions, questions from the translator for other translators to read and answer/think about, etc.

The translation volunteer has completed a draft. This person should be able to go back to the cover page and indicate that a draft translation exists. If the person wants to make sure the translation is reviewed before it is used for its final purpose, the cover page could have an option such as “hold until feedback received.”

If this option is selected, another email is sent out to all translation volunteers, letting them know that a draft is waiting to be reviewed. (It’s also possible we want this email to go out even if feedback is not “required,” just so the whole team sees that a draft was made, and can provide feedback if they choose to).

Again, a link to the cover page is in the email. Volunteers click on the link and can clearly see if anyone has already given feedback, or if nobody has. Anyone who provides feedback would be able to check a box on the cover page, or some other such indicator, to show they have reviewed the draft.

The draft has editing marks or some other way of tracking who provided what edits, perhaps a way of accepting/rejecting changes, etc. Editing marks or comment marks could also be useful for the original translator to include questions, comments, etc. about nuances of the translation. This could help for translation volunteers to be able to discuss topics of concern, such as using culturally appropriate terms, selecting language that the audience identifies with, tone, etc.

Whoever is responsible for moving the translated doc on to its final purpose (perhaps this is another piece of info on the cover page?) gets an alert once a draft exists, and perhaps gets an alert each time feedback is provided. Whenever this person actually takes the draft and uses it in its final destination, this is also indicated on the cover sheet for the document.
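The status rules described above (no automatic assignment, an optional "hold until feedback received" flag, feedback check boxes, and a final "retrieved and used" indicator) could be sketched roughly as follows. This is only an illustration under assumptions: the class, field, and status names are hypothetical and do not come from any of the tools discussed in this ticket.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    UNASSIGNED = "unassigned"
    ASSIGNED = "assigned"
    DRAFT_COMPLETE = "draft complete"
    RETRIEVED = "retrieved and used"


@dataclass
class CoverSheet:
    """Hypothetical cover sheet holding the status indicators listed above."""
    title: str
    direction: str                      # e.g. "en-es" or "es-en"
    posted_by: str
    status: Status = Status.UNASSIGNED  # posting never assigns anyone
    translator: Optional[str] = None
    hold_until_feedback: bool = False
    reviewers: List[str] = field(default_factory=list)

    def claim(self, volunteer: str) -> None:
        # A document is only "assigned" once a volunteer agrees to take it.
        if self.status is not Status.UNASSIGNED:
            raise ValueError("document is already assigned")
        self.translator = volunteer
        self.status = Status.ASSIGNED

    def mark_draft(self, hold_until_feedback: bool = False) -> None:
        if self.status is not Status.ASSIGNED:
            raise ValueError("only an assigned document can get a draft")
        self.hold_until_feedback = hold_until_feedback
        self.status = Status.DRAFT_COMPLETE

    def add_feedback(self, reviewer: str) -> None:
        # Equivalent to a reviewer checking a box on the cover sheet.
        if self.status is not Status.DRAFT_COMPLETE:
            raise ValueError("there is no draft to review yet")
        self.reviewers.append(reviewer)

    def can_be_used(self) -> bool:
        if self.status is not Status.DRAFT_COMPLETE:
            return False
        return bool(self.reviewers) or not self.hold_until_feedback

    def mark_retrieved(self) -> None:
        if not self.can_be_used():
            raise ValueError("draft is not ready to be used")
        self.status = Status.RETRIEVED


sheet = CoverSheet("Member newsletter", "en-es", posted_by="author")
sheet.claim("volunteer_a")
sheet.mark_draft(hold_until_feedback=True)
print(sheet.can_be_used())    # the draft is held until someone reviews it
sheet.add_feedback("volunteer_b")
sheet.mark_retrieved()
print(sheet.status.value)
```

Keeping the transitions in one place makes it easy to see which alerts would hang off which step, for example notifying all volunteers from `mark_draft` and the responsible person from `add_feedback`.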

And a wish for the wish-list:

For any documents that are posted to a website, wiki, etc., it would be great if folks could continue to edit and refine into the indefinite future. This again is related to honoring culture and the language used in the struggle within each language group, as well as to our goal of being able to continuously improve our vocabulary in each language. If a document is on the web, and we find out that there's a better term than is already being used, it would be great to be able to change the document within our translation system, and have any online copies of the document update automatically. I'm not a programmer, but this sounds like a tall order to me. However, we are talking about the ideal world, here ;)

Change History (20)

comment:1 Changed 6 years ago by https://id.mayfirst.org/ross

  • Keywords translation added
  • Owner set to https://id.mayfirst.org/josue
  • Status changed from new to assigned

This sounds great josue. Jamie says you're more than willing to set it up, so I'm assigning this to you.

comment:2 Changed 6 years ago by https://id.mayfirst.org/josue

i set up translate.mayfirst.org with drupal 7. i downloaded the translation_management module. the 7.x-1.0-beta1 version has Dec 29, 2010 as its date, while the 7.x-1.x-dev has May 19, 2011. i decided to start with the dev version.

when i enable the module i get:

PDOException: SQLSTATE[42000]: Syntax error or access violation: 1101 BLOB/TEXT column 'strings' can't have a default value: CREATE TABLE {icl_image_status} ( `id` INT unsigned NOT NULL auto_increment COMMENT 'The primary identifier for an image.', `data` LONGTEXT NOT NULL COMMENT 'Contains the image path of other data for embedded objects.', `type` VARCHAR(32) NOT NULL DEFAULT '' COMMENT 'The type of data.', `language` VARCHAR(12) NOT NULL DEFAULT '' COMMENT 'The language of this data.', `tnid` INT unsigned NOT NULL DEFAULT 0 COMMENT 'The translation set id for this data, which equals the data id of the source data in each set.', `md5` VARCHAR(32) NOT NULL DEFAULT 0 COMMENT 'md5 of the data contents', `strings` LONGTEXT NOT NULL DEFAULT '' COMMENT 'strings in the image/data that need translation.', PRIMARY KEY (`id`) ) ENGINE = InnoDB DEFAULT CHARACTER SET utf8 COMMENT 'The ICanLocalize image translation statuses. Also for...'; Array ( ) in db_create_table() (line 2688 of /usr/local/share/drupal-7.20/includes/database/database.inc).

and when i try to just go to the main url i get:

Fatal error: Call to undefined function user_access() in /home/members/guillen/sites/translate.mayfirst.org/web/sites/all/modules/translation_management/icl_content/icl_content.dashboard.inc on line 9

i get the above errors after installing the core and translate modules that are a part of this module. when i try to enable the content module, i get the above errors.

i switched to the beta1 version and it seems to install properly.
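The first error above is MySQL refusing a DEFAULT clause on a LONGTEXT column (`strings`). As a rough illustration of how such a definition could be caught before a module is enabled, here is a minimal sketch. It assumes the schema is available as a plain dictionary loosely modelled on the shape of a Drupal schema array; the function name and the sample data are hypothetical and are not part of the module.

```python
# Hypothetical sketch: flag text/blob columns that declare a default value,
# which MySQL rejected with error 1101 in the message above.

TEXT_LIKE_TYPES = {"text", "blob"}  # assumption: type names used in the sample


def columns_with_illegal_defaults(schema):
    """Return (table, column) pairs where a text/blob column has a default."""
    problems = []
    for table, definition in schema.items():
        for column, spec in definition.get("fields", {}).items():
            if spec.get("type") in TEXT_LIKE_TYPES and "default" in spec:
                problems.append((table, column))
    return problems


# Sample data mirroring two columns from the failing CREATE TABLE statement.
sample_schema = {
    "icl_image_status": {
        "fields": {
            "data": {"type": "text", "size": "big", "not null": True},
            "strings": {
                "type": "text",
                "size": "big",
                "not null": True,
                "default": "",  # this is what triggers the error
            },
        }
    }
}

print(columns_with_illegal_defaults(sample_schema))
```

In the sample, only `strings` is reported, which matches the column named in the error message.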

to be continued...

comment:3 Changed 6 years ago by https://id.mayfirst.org/josue

ok, so the saga continues.

the drupal7 version of the translation_management module is a mess. i decided to take a look at the D6 version first, to see what the functionality is supposed to be. i created a D6 site, enabled the module, and got it all working.

these modules are designed to connect with different companies that provide paid translation, but they do have a Local Translators option. i chose that and added two drupal users with distinct permissions: Translation Manager and Translator.

i then enabled the multilingual options on a content type i created called Document. and then i created a Document that needed to be translated.

no auto-notification out of the box to the translators, but i can imagine Rules could be used to implement that.

the Translation dashboard lists the docs that could be translated, with their status.

when the Translator was assigned a document to translate, it showed up on the dashboard, and clicking edit creates a new page with the content inside the form, ready to be overwritten as it is translated. i was hoping for side by side fields.

okay, stopping here. this is the D6 version. D7 has to be improved if we are to use it.

comment:4 Changed 6 years ago by https://id.mayfirst.org/josue

more updates: just found this: http://drupal.org/node/988154#comment-5515640, which talks about this module being abandoned and all efforts going into this one: http://drupal.org/project/tmgmt

am installing and configuring this one. more updates soon...

comment:5 Changed 6 years ago by https://id.mayfirst.org/josue

okay, rapidly approaching my limits...

translate.mayfirst.org has the tmgmt module installed and configured, to the best of my ability.

http://drupal.org/node/1445820, which describes the features and basic concepts, says: "Local users. Present a user with a two panneled page for immediate translation. Not yet functional"

so, the key component that we need is not yet implemented in this module.

will revisit this once the priority list for our workflow is added to this ticket.

--josue

comment:6 Changed 6 years ago by https://id.mayfirst.org/josue

been doing more research, so thought i would track this here...

in looking up translation management systems, they are mostly focused on the business end of managing translators that you are paying for work. these do not seem to have any mechanism for supporting the work of the translator, like a screen that shows you the doc and next to it a space for writing the translation. http://www.vitroff.com/default/features is an example we could look at.

we could complement such software with tools to support our translators. an example would be http://extensions.openoffice.org/en/project/TranslationTable, an extension for open office that creates a side-by-side environment for translating docs.

http://www.omegat.org/ seems like another tool that we should incorporate somehow. i think it would help in building a knowledge base of common terms and facilitate our translators being able to use the things learned from previous translations. but we would need someone to dive into this software to better understand how we could use it.

comment:7 Changed 6 years ago by https://id.mayfirst.org/josue

one more thing, for now...

http://wpml.org/ sells a plugin for wordpress ($79) that seems pretty nice.

WPML comes with state-of-the-art translation management. You can turn ordinary WordPress users into Translators. Translators can access only specific translation jobs which Editors assign to them.

WPML sends notification emails, provides a translation management screen, a jobs-queue and side-by-side translation editor.

something else to look at...

comment:8 Changed 6 years ago by https://id.mayfirst.org/jamie

Thanks for all the research Josue - it's amazing that something that seems so simple is not easily available.

I watched the WordPress plugin video and didn't see the one feature we need so much - side-by-side textarea boxes (even though it advertises that feature).

I just applied for a login on translate.mayfirst.org so I can get a feel for what tmgmt can do for us.

Based on your research, I'm leaning toward asking the support team to help us add the side-by-side textarea feature we want to tmgmt and see if we can use it.

Also, tmgmt's workflow doesn't seem advanced enough for our needs. However, I think we may be able to use the workflow module instead. It would take more work to set up, but would allow us to be much more flexible.

I'll know more once I have a chance to tool around.

jamie

comment:9 Changed 6 years ago by https://id.mayfirst.org/dkg

  • Keywords weblate drupal added

fwiw, i think that weblate should be considered for this. I'm in touch with Michal Čihař, the upstream developer, and he seems quite happy to see it used for this sort of work.

weblate is under active development, is free software, and uses a pretty sensible stack

  • implemented in python with the django framework,
  • uses git to manage the actual text (so that people who want to do non-web-based or even offline work have a way to do it)
  • can use postgresql as an rdbms backend, and
  • should be able to do web-based authentication using apache modules instead of having yet another set of passwords to remember (i.e. we might be able to use mod_auth_openid or other mechanisms).

I suspect that if we have reasonable changes or new features we want to see in weblate, Michal would be willing to adopt them and merge them upstream.

I encourage someone to set up something like http://weblate.dev.mayfirst.org/ and give it a try. I'd be happy to advise on this, but i don't have the time to take it on myself right now.

comment:10 follow-up: Changed 6 years ago by https://id.mayfirst.org/jamie

I got a demo account at the weblate demo site and have tooled around a bit.

It has a lot of the features we want (see the user docs), including maintaining a project dictionary and plugins to various machine translation software plus an interface that shows a text box next to the translation string.

I'm still struggling to figure out if it can address some other needs:

  • Translating large documents - the web interface seems designed to translate short strings from a .po file. I'm not sure how the web interface would work with a large document.
  • Translator workflow - it seems well suited for random volunteers to translate various small strings, but I'm not sure it's possible to assign translators particular projects (to avoid duplication of efforts when translating large documents)
  • I can't quite tell how to add things to be translated. It seems like you have to check in a document via git, which I think provides too big a hurdle for non-technical staff to submit things to be translated (at least without a web interface). I'm also not sure how it would work if a document was submitted for translation that wasn't in a .po format.
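To make the questions about large documents and non-.po input more concrete, here is a minimal sketch of turning a plain-text document into gettext PO entries, one per paragraph. It is not how weblate imports content; the function names are hypothetical, and splitting on blank lines is an assumption about how a document might reasonably be broken up.

```python
# Hypothetical sketch: convert a plain-text document into a minimal PO file,
# using one msgid per paragraph and leaving every msgstr empty.

PO_HEADER = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=UTF-8\\n"\n'
)


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def document_to_po(text):
    """Split text on blank lines and emit one PO entry per unique paragraph."""
    entries = [PO_HEADER]
    seen = set()
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph or paragraph in seen:
            continue  # a msgid may only appear once in a PO file
        seen.add(paragraph)
        entries.append('msgid "%s"\nmsgstr ""\n' % po_escape(paragraph))
    return "\n".join(entries)


sample = 'Welcome to the "Lowdown".\n\nThanks for reading.\n\nThanks for reading.'
print(document_to_po(sample))
```

Paragraph-sized strings would keep the existing string-by-string web interface usable for longer texts, at the cost of losing formatting that spans paragraphs.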

comment:11 Changed 6 years ago by https://id.mayfirst.org/roberto

hey folks -

sorry it's taken a minute to get back to y'all. the transition to nyc has taken more time than anticipated (surprise). thanks all for the research and work thus far. i took time today to read all the entries and follow the links. offer three thoughts in the order that makes sense (at least in my head).

1) tech v people. in some of the links i followed the programs seemed to offer not only workflow mgt but also actual translation. same was mentioned in some of the threads. i would advise to not use a tech based translation until a fool-proof program has been developed. auto-translators cannot discern nuance and intent, both of which are abundant in our communications. following is a prioritized workflow predicated on human translators and the following suppositions: a) MF/PL membership not required to volunteer (ie access system) and b) the volunteers have the necessary skills to accurately translate (specifically register) MF/PL documents. At a minimum, this means a working knowledge of both source and target languages. Until we work out a collective glossary, these folks should be able to get by with wordreference.com.

2) Prioritized work flow. This is stripped to basics to get us started. I've noted the extras for later development. Depending on progress, maybe we could do a test run with the next Lowdown(?) (which still needs a new name).

  • Source Doc - this should live outside of whatever container is built. In other words, most Lowdown articles (if i understand correctly) are created inside the system and moved along the trac. Ideally source docs should be able to originate anywhere - so I could draft something on the word processing software on my laptop and still be able to submit for translation.
  • Container - this is both the entry point for the document and a clearinghouse. Regardless of where the doc originates, this is where it is sent for storage and retrieval. It is here that the translators and/or admin person accesses it. Decision point: Can individual submitter access this clearinghouse or do they send to an admin, who then forwards to clearinghouse for dissemination? My gut says start with an admin at first - so that someone has a 360 view - and once the kinks are worked out maybe change.
  • Notification. Volunteers are alerted to pending jobs. Decision point: Once in the container, is there an automated notice that goes out to a predetermined group (folks that have signed up and are plugged into system) or does the notice go to an admin who then forwards? Note: I don't think an admin is necessary in both steps (container and notification) unless you want super strict controls or you have someone with the desire/capacity to play that role. As long as there is some central person(s) in one of these two steps to make sure that assignments aren't falling through the cracks, should be fine.
  • Access - this is the point where the volunteers pick an assignment, access it (either inside a program or via download), translate it, and close it.
  • Return - to jump ahead to Penny's doc a bit: I don't think the translators should be responsible for making sure the doc gets to its final destination (website, email, etc). Ideally, they should be able to put back in the clearinghouse marked done and the originator should be alerted somehow.
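The five steps above (source doc, container, notification, access, return) can be sketched as a small clearinghouse object. This is a minimal illustration under assumptions, not a design for any particular tool: the class and method names are hypothetical, and notification is a plain callable so that email, or an admin who forwards by hand, could be plugged in later.

```python
# Hypothetical sketch of the container/notification/access/return steps.

class Clearinghouse:
    def __init__(self, volunteers, notify):
        self.volunteers = list(volunteers)
        self.notify = notify      # callable(recipient, message)
        self.jobs = {}            # job id -> job record
        self._next_id = 1

    def submit(self, originator, text, direction):
        """Container + notification: store the doc, alert every volunteer."""
        job_id = self._next_id
        self._next_id += 1
        self.jobs[job_id] = {
            "originator": originator,
            "text": text,
            "direction": direction,
            "translator": None,
            "translation": None,
        }
        for volunteer in self.volunteers:
            self.notify(volunteer, "Job %d (%s) needs a translator" % (job_id, direction))
        return job_id

    def claim(self, job_id, volunteer):
        """Access: a volunteer picks an assignment; a job can be claimed once."""
        job = self.jobs[job_id]
        if job["translator"] is not None:
            raise ValueError("job already claimed")
        job["translator"] = volunteer

    def return_done(self, job_id, translation):
        """Return: mark the job done and alert the originator, not the translator."""
        job = self.jobs[job_id]
        if job["translator"] is None:
            raise ValueError("job has not been claimed")
        job["translation"] = translation
        self.notify(job["originator"], "Job %d has been translated" % job_id)


outbox = []  # stands in for email delivery
house = Clearinghouse(["volunteer_a", "volunteer_b"], lambda to, msg: outbox.append((to, msg)))
job = house.submit("author", "Hello, members.", "en-es")
house.claim(job, "volunteer_a")
house.return_done(job, "Hola, miembros.")
for recipient, message in outbox:
    print(recipient, "<-", message)
```

The source doc stays outside the system here: `submit` only takes text, so it does not matter where the document was drafted.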

This is the basic outline for the process and I would suggest we work on these first. Below, some thoughts of the pieces highlighted in Penny's doc.

  • Cover sheet: the cover sheet concept is good. Whether in a cover sheet or a GUI (see there - im learnin') format, the following info should accompany the doc request
    • Type of communication (letter, webpage, etc)
    • Deadline - I agree w Penny that it should not be ambiguous. Just a deadline date. Folks will either have it done by then or not. If not, it will need to go out again. There is a front-end conversation about guidelines and agreements for the volunteers as to responsibility in choosing assignments.
    • Direction of translation (source to target)
    • Author (in case have to check in re original content). This can be tricky tho, as the translator's job is not to edit for clarity vis-a-vis intent.
  • Formatting: for straight word processing docs, translators should copy formatting. For webpage text requiring knowledge of html or whatnot, straight text is preferred and web person can take care of formatting.
  • Status indicators: This opens up the conversation about a revision, editing, and proofing loop, which is an important part of a comprehensive translation pipeline. I would say that we forgo building this component now, especially if point 1, supposition b holds true. Post having worked out a basic flow and having set up a protocol for volunteers (recruitment, assessment, etc) we can tackle this piece.

3) The links. Real quick thoughts on the links I followed:

  • tmgmt: looks doable if we can make it do what we need it to do
  • vitroff: couldnt really get a feel for it
  • openoffice/translationtable: looks great w the side by side (would definitely get you folks major points)
  • omegat: good memory tool tho not critical
  • wpml: looks awesome, super easy to use
  • weblate: like the idea, not the most user-friendly. depends on the learning curve.

I think a key thing to remember is that not all these folks are techies and are already donating their time and talent, so the easier it is for folks to navigate the system, the more they will be inclined to plug in.

OK enough for today. Thanks all r

comment:12 in reply to: ↑ 10 Changed 6 years ago by https://id.mayfirst.org/dkg

Replying to https://id.mayfirst.org/jamie:

  • Translating large documents - the web interface seems designed to translate short strings from a .po file. I'm not sure how the web interface would work with a large document.

I've asked Michal about this myself, and he said:

I don't think there is anything special needed for translating documents. For example The Debian handbook or documentation for phpMyAdmin is being translated using Weblate.

jamie wrote:

  • Translator workflow - it seems well suited for random volunteers to translate various small strings, but I'm not sure it's possible to assign translators particular projects (to avoid duplication of efforts when translating large documents)

How mf/pl would break out the documents into projects or elements within a given project is a good workflow question. I suspect that some experimentation would be needed to see what the different tradeoffs would be. I'm also not sure that weblate has a native concept of deadlines, but this strikes me as a feature that we could add and would probably be appreciated upstream.

  • I can't quite tell how to add things to be translated. It seems like you have to check in a document via git, which I think provides too big a hurdle for non-technical staff to submit things to be translated (at least without a web interface).

This also sounds like a feature that would be straightforward to add, and probably appreciated upstream.

comment:13 Changed 6 years ago by https://id.mayfirst.org/penny

It's been great reading everyone's feedback and research. Here are a few of my thoughts in response:

Roberto's prioritization for cover sheet/GUI: I agree that it could get complicated listing Author for the sake of the translator getting in touch about the document, and I agree with your reasons. It was not my intent to suggest that the author and translator should consult on the translation/clarification of a text. I DO think that listing Author is good practice, for accountability and transparency. However, what I think is the most practical - if we're talking about bare-bones, top-priority-items-only cover sheet - is not author, but the person responsible for moving the document to its final purpose.

Speaking of priorities lists, the list of programs people are researching is getting long. I am wondering if we are at a point where it would be good to start comparing/eliminating some of these translation programs? I like to be able to compare everything quickly, so for my brain, something like the below list is helpful. Do others feel like they are starting to settle in on a preferred program? or honing in on what parts of each program to pull for our franken-translation system?

Your positive comments, greatly condensed:

  • Vitroff/Drupal7: dashboard with doc status
  • Extensions.openoffice: side-by-side
  • omegat: project dictionary, good memory
  • WPML: side-by-side, can assign translators, email notifications, easy to use
  • TMGMT: I can't say I've seen specific, good feedback about this? Any argument for using this/pulling from this?
  • Weblate: project dictionary, adaptability (but is it more adaptable than others?)

I know this list doesn't include every benefit of each program, but these are the benefits that I see explicitly stated in this thread.

comment:14 Changed 6 years ago by https://id.mayfirst.org/dkg

Some notes:

  • While the openoffice extension and OmegaT both sound cool, they look like entirely client-side tools, and don't offer any clear mechanism for collaborative work -- i don't think these are solving the same problems as the others.
  • WPML does not appear to be free software.
  • comment:4 links to a page that suggests that tmgmt is the ongoing focus of drupal translation management, so it should be grouped with drupal.
  • Vitroff is distinct from drupal, and is ostensibly GPLed, but my attempts to get the source code resulted in either "Can't process the form, please try again later" or "Please try to correctly fill out the form". I've sent them an e-mail at supp@vitroff.com, but haven't heard back from them yet. This doesn't seem like a realistic option, if we can't even get the source code.
  • Vitroff is also apparently reliant on php and mysql, which i consider a mark against it, but other people may feel differently.
  • Weblate also offers a dashboard with doc status, i think either per-project or as a whole system.
  • Weblate is free software, which means there is no vendor lock-in, and it means that any improvements we make to weblate to make it suit our purposes better can be offered to support anyone else who is attempting the same sort of work.

comment:15 Changed 6 years ago by https://id.mayfirst.org/dkg

  • Summary changed from Requesting a translation drupal site to setting up a web site for the MF/PL translation team

I just got a personalized link to download the source for vitroff. I'm not sure why it's personalized, since the source code is ostensibly GPL'ed, but maybe the vitroff upstream doesn't really understand free software licensing as well as they'd like to.

I'm reviewing the source now.

comment:16 Changed 6 years ago by https://id.mayfirst.org/dskallman

Just wanted to note that for WP, I've used http://wordpress.org/extend/plugins/polylang/, works great. I wouldn't recommend WPML, it's paid, not open, and overwhelming.

Outside of that, wanted to check in on this? Is it still a work in progress or wip?

comment:17 Changed 6 years ago by https://id.mayfirst.org/roberto

i was wondering the same thing. to the best of my recollection, folks were throwing out ideas of different options of ways to go and there was some good back and forth. aside from weighing options, i dont recall a concrete plan to implement something. anyone else?

r

comment:18 Changed 6 years ago by https://id.mayfirst.org/dskallman

Checking in an update here.

comment:19 Changed 6 years ago by https://id.mayfirst.org/tezcatl

Thanks for the recommendation Dana, it looks like a great plugin and a very useful alternative to premium WPML. Indeed, I'm going to propose it to an organization I'm working for.

I've read this conversation and, seeing that there are so many alternatives out there, my question is: Do we really need another website just for translating texts?

IMHO, "smo" has done a great job so far. I've even tried translating something in a wiki page and importing the job into the ticket using the Include macro (#6403), which lets us review translations and make only the necessary changes, instead of copy-pasting whole versions and finding it really difficult to diff the changes made by another translator.
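To illustrate the point about diffing changes between translators, here is a minimal sketch using Python's standard `difflib` module. The helper name and the two sample drafts are made up for the example; a wiki page history gives a similar view without any code.

```python
import difflib


def draft_diff(old, new, old_label="draft 1", new_label="draft 2"):
    """Return a unified diff between two versions of a translation."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)


# Made-up drafts: the second translator changes only the first sentence.
first = "Bienvenidos al sitio.\nGracias por su apoyo."
second = "Bienvenidas y bienvenidos al sitio.\nGracias por su apoyo."

print(draft_diff(first, second))
```

Lines starting with `-` and `+` show exactly what the second translator removed and added, so a reviewer does not have to compare two full copies by eye.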

Instead of setting up a whole different platform for this work (not saying I'm not curious, I just requested a user account on translate.mayfirst.org in order to give it a test), how about adding a custom workflow and deadline features to Trac, to make the system better suited to the features we agree on?

comment:20 Changed 5 years ago by https://launchpad.net/~jeremyb

  • Cc https://launchpad.net/~jeremyb added
