STC Toronto meeting: automated editing with artificial intelligence

This meeting took place on October 12, 2006; our faithful meeting reporter Ed Beliczynski wrote about it later.

STC Toronto general meeting: STC Annual Conference / AI Copy Editor

By Ed Beliczynski (ed dot techwriter at gmail dot com)

This meeting had two parts.

Richard Mann presented on the upcoming STC Annual Conference (May 13-16, 2007)

Richard (www.richardmann.net) easily showed that the Twin Cities of Minneapolis-St. Paul is a beautiful area with many outstanding places to see and visit no matter what your interests are. The 540 members of the local chapter will be on hand to greet and assist all STC visitors to help make their stay enjoyable and productive. Please visit the STC website (www.stc.org) for the latest details.

Kent Taylor spoke about a software copy editor with artificial intelligence

I have no doubt that in the mind of some science fiction author somewhere there exists a world of advanced technology where computers rule entire civilizations and govern a society of peace and prosperity. In that world computers have probably evolved to the point where they can write entire articles, novels, or user guides and intelligently edit them to masterful perfection. One look at the “perfection” of Microsoft Word’s grammar feature shows us that our reality is far different. Computers have, at best, stumbled upon the English language, awkwardly trying to make sense of a tongue that is not their own. Driven by logic, they have the advantage of being tireless and of almost never making an error on a task for which they were programmed.

Human writers, on the other hand, may be flawed, but have a plethora of tools at their disposal. Creativity, ingenuity, instinct, and experience can all play in the favour of a talented writer. Knowing when to break rules can lead to moments of inspiration and offer up better solutions than predefined routes. Human writers can refer to a multitude of style guides and even consult other writers whose combined knowledge pool and skill set is, evidently, great.

Now imagine a world where the best of what computers can do is combined with the advantages that humans bring to the editing of the written word. This is the world of Acrolinx.

Our presenter, Kent Taylor, is a lifelong Technical Writer currently living in “middle of nowhere” Colorado. He managed publications at AT&T and Lucent for nearly 20 years and has over 30 years of documentation experience in total. Acrolinx is a company that was spun off from the German Research Center for Artificial Intelligence (DFKI). Perhaps my earlier science fiction reference was not that far off. Acrolinx has leveraged the strengths of the field of computational linguistics to create a client/server copy editor program (Acrocheck) which intelligently corrects grammar, adheres to style guides, and checks wording for clarity and succinctness. The STC audience was very impressed.

In showcasing Acrocheck, one of the areas that Kent focused on was translation. Machine translation (MT) in its current form is very fast, but you need to edit about 50% of the content afterwards. If you can improve the quality and content of the source writing, you can get that rewriting down to about 20%.

The “Cost” of Quality for a Specific Project

  • Savings of as much as $1 million/year so far
  • Translation costs for MT projects cut by 50%
  • Time to market for MT projects cut by 50%
  • 40% content reuse
  • 75% reuse for localization

Pre-editing is where you clean up the consistency, scope, and correctness of the source material. Not only would you correct the obvious grammatical errors, but you would also limit the range of words used. Instead of drawing on 500 different words to write a paragraph, you might use 100 predefined words. I’ll simplify here, but consistently writing “Click” instead of “Press”, “Select”, or “Use” limits the scope of your vocabulary and makes for easier machine (or human) translation. Lastly, with pre-editing, every word in your vocabulary has only one meaning. This leads to less ambiguity (and fewer mistakes) in translation and also a clearer meaning for English as a Second Language readers.
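To make the “Click” example concrete, here is a minimal sketch in Python of a controlled-vocabulary check. It is not how Acrocheck works internally (the presentation did not go into that). The term list `PREFERRED` and the function `flag_variants` are hypothetical names made up for illustration, and plain word matching like this cannot tell when “Press” or “Use” is legitimately the right word.

```python
import re

# Hypothetical term list: each discouraged variant maps to the one preferred term.
PREFERRED = {"press": "click", "select": "click", "use": "click"}


def flag_variants(text, preferred=PREFERRED):
    """Return (word, suggestion, offset) for each discouraged variant in text."""
    findings = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0)
        suggestion = preferred.get(word.lower())
        if suggestion is not None:
            findings.append((word, suggestion, match.start()))
    return findings


sample = "Press OK, then select Save. Click Close."
for word, suggestion, offset in flag_variants(sample):
    print(f"offset {offset}: consider '{suggestion}' instead of '{word}'")
```

A real checker would need linguistic analysis to decide whether a flagged word is actually being used in the restricted sense; this sketch only shows the idea of mapping many variants onto one approved term.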

Acrocheck has a writer/editor interface with on-demand checking and guidance. The program operates as a plug-in to your favourite writing application, be it Word, FrameMaker, Arbortext Editor, AuthorIT, or “just about everything”. Since the application functions in a client/server fashion, any rules you create reside on the server and are available to all clients.

The Acrocheck writer style guide was created by analysing many style guides (such as the Chicago Manual of Style) and finding the commonalities. Computational linguists at Acrolinx found that 80-90% of rules are the same across all guides. With that observation, they built those common best practices into the system.

After each check, an XML document is generated to inform the writer of the changes made and their scope. You can specify which aspects of a document you wish to analyze. In addition, the program looks at new terminology and intelligently integrates those terms into its knowledge set.

Writing departments can “batch check” a number of documents and create an aggregate report for quality control. Common errors will be flagged, and with a bit of gentle reminding by the program, will no longer be made by writing staff. With the aggregate report you can view details such as style, grammar, terms and spelling.
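As a rough illustration of what an aggregate report involves, the sketch below sums per-document issue counts by category. Everything in it is an assumption for the sake of the example: the file names, the counts, and the `aggregate` function are made up, and the actual Acrocheck report format was not described in this level of detail.

```python
from collections import Counter

# Made-up per-document results: number of flagged issues in each category.
results = {
    "install_guide.txt": {"style": 4, "grammar": 2, "terms": 5, "spelling": 1},
    "user_guide.txt": {"style": 1, "grammar": 0, "terms": 3, "spelling": 2},
}


def aggregate(per_document):
    """Add up the issue counts of all documents, category by category."""
    total = Counter()
    for counts in per_document.values():
        total.update(counts)  # Counter.update adds counts rather than replacing
    return total


report = aggregate(results)
for category, count in report.most_common():
    print(f"{category}: {count}")
```

Sorting by the most common category is one way to see which kind of error a writing team should tackle first.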

A system administrator can manage terminology by importing term banks and can add, edit, or delete rules. The complex style rules are customized by computational linguists and, for added flexibility, you can have multiple style sheets and terminology dictionaries.

With reports, Kent suggested that you could set up service level agreements with translators and tell them that you’d like to get a better rate if you submit only green (Acrocheck approved) documents. Cleaning up your text to make it easier for the translators to work with could provide a 10-30% reduction in translation costs.

The benefits of machine processing can be summarized in a number of ways:

  • Localization with more consistency and more reuse
  • Guaranteed source material quality
  • More automation

Support call deferrals:

  • More consistency for better on-line searching and better self-help
  • Fewer support calls
  • Reduced product liability risk
  • Certifiable quality

Internal process efficiencies and cost savings:

  • Less rework
  • Less copy editing
  • Cleaner handoffs between process tasks

An initial setup of Acrocheck typically takes a couple of weeks. Afterwards, smaller customizations take one or two days. As you may have imagined, this is not an inexpensive product and is geared towards large organizations with significant writing needs.

After this presentation I noticed the kind of tingle in the room that people feel when they see something exciting and inspirational. Unfortunately, most of us will not experience the convenience offered by a product like Acrocheck, but we can always hope that one day a personalized version will appear for the masses.

Thanks once again to Kent Taylor for his presentation. He can be reached at “kent at acrolinx dot com.” See also www.acrolinx.com.

Ed Beliczynski is a Technical Writer/Trainer for ExtendMedia Inc. and an IT professional who has transitioned from the world of programming. Ed’s eclectic background spans the worlds of video, finance and music. Once the front man of a progressive rock band, in his spare time he now struggles to stay off the internet.
