A Deep Dive on 7 Language Translation Tools

At Oasis, we’re committed to finding the best legal technology for our customers. Because so many matters now have a foreign language component, we thought it was about time we took a deep dive into what the market has to offer in machine translation. We attended demos, created a control set, compared results, and evaluated each tool on six characteristics: cost, usability, market flexibility, accuracy, overall presentation, and growth potential.

We learned a lot and thought we’d pass our lessons along to those in need of a strong translation solution.

Before You Dive In

In the context of litigation document review, there are three options when you encounter documents in a language in which you are not fluent: (1) hire a document review/staffing company that can review the documents in the given language, (2) pay a human to translate the originals into the native language of the reviewers, or (3) use automated tools that rely on algorithms to translate, i.e. “machine translation.”

Option one—native review—is the most accurate and most expensive. Option two—human translation—gives a high-quality output but can cost a fortune considering you’re paying for the translation on top of the review. Option three—machine translation—gives you the lowest quality output but is fast and (relatively speaking) cheap.

Our evaluation focused exclusively on technology solutions in the third category, machine translation tools, as they are the most common method for foreign language document review.

Our Evaluation

We reviewed seven machine translation (MT) technologies in terms of their cost, usability, market flexibility, accuracy, overall presentation, and growth potential. The tools included in our evaluation were:

We rated each application against the same criteria: overall presentation, cost, usability, accuracy, market flexibility, and growth potential. To gather the details, we sat in on product demos with the software vendors, reviewed documentation and other information on each tool, and held review discussion panels with our team of technology experts and industry veterans (the smart kids in the class). From these, we scored each tool from 1 to 5 in each category. Those scores underpin the findings presented in this evaluation.
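For readers who want to run a similar exercise, here is a minimal sketch of how a 1-to-5 scorecard across the six categories could be turned into a list of strengths and weaknesses rather than a single ranking. The thresholds, function names, and example scores are illustrative assumptions only; they are not the scores or the exact method used in this evaluation.

```python
# Hypothetical scorecard helper. Thresholds and example scores are
# assumptions for illustration, not results from the evaluation.
CATEGORIES = (
    "cost",
    "usability",
    "market flexibility",
    "accuracy",
    "overall presentation",
    "growth potential",
)


def check_scores(scores: dict[str, int]) -> None:
    """Ensure every category is present and scored from 1 to 5."""
    missing = set(CATEGORIES) - scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    for category in CATEGORIES:
        if not 1 <= scores[category] <= 5:
            raise ValueError(f"{category} score must be between 1 and 5")


def strengths(scores: dict[str, int], threshold: int = 4) -> list[str]:
    """Categories scored at or above the (assumed) strength threshold."""
    check_scores(scores)
    return [c for c in CATEGORIES if scores[c] >= threshold]


def weaknesses(scores: dict[str, int], threshold: int = 2) -> list[str]:
    """Categories scored at or below the (assumed) weakness threshold."""
    check_scores(scores)
    return [c for c in CATEGORIES if scores[c] <= threshold]


# Made-up scores for a made-up tool.
example_scores = {
    "cost": 3,
    "usability": 5,
    "market flexibility": 2,
    "accuracy": 4,
    "overall presentation": 4,
    "growth potential": 3,
}

print("Strengths:", strengths(example_scores))
print("Weaknesses:", weaknesses(example_scores))
```

Keeping the per-category scores separate, instead of summing them into one number, makes it easier to match a tool to a specific use case.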

Results

These tools are all remarkably similar. That’s not to say there aren’t interesting differences that could tip the scales for different use cases, but it’s not possible to boil this decision down to something simple. Honestly, we were hoping for more definitive results. As a result of our findings, we created a simple diagram that displays the relative strengths and weaknesses of each application instead of ranking the technologies from best to worst. We figured this would be more helpful for you to review and choose the tool that fits your needs. We’ve also included some of our analysis for each of the six criteria, along with a few key takeaways for each of the products we reviewed.

Strengths and Weaknesses

Cost

Prices vary somewhat, but not enough to make price a useful differentiator in an on-premise SaaS use case. Most tools use a per-word pricing model; some instead quote a per-page or per-document rate. Either way, the math works out to be reasonably comparable. It is worth noting that we found most of these companies more than willing to work with you on pricing where needed.
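To compare quotes given in different units, it can help to convert everything to a per-word rate. The sketch below shows one way to do that. Every number in it is a made-up placeholder: the rates, the tool names, and the assumed averages of words per page and pages per document are not vendor quotes or figures from our evaluation, so substitute values from your own data set.

```python
# All rates and document-size averages are hypothetical placeholders.
WORDS_PER_PAGE = 300      # assumed average; measure your own collection
PAGES_PER_DOCUMENT = 4    # assumed average; measure your own collection


def per_word_rate(rate: float, unit: str) -> float:
    """Convert a quoted rate (per word, page, or document) to a per-word rate."""
    if unit == "word":
        return rate
    if unit == "page":
        return rate / WORDS_PER_PAGE
    if unit == "document":
        return rate / (WORDS_PER_PAGE * PAGES_PER_DOCUMENT)
    raise ValueError(f"unknown pricing unit: {unit}")


# Hypothetical quotes: (rate in dollars, pricing unit).
quotes = {
    "Tool A": (0.002, "word"),
    "Tool B": (0.75, "page"),
    "Tool C": (2.40, "document"),
}

normalized = {
    name: per_word_rate(rate, unit) for name, (rate, unit) in quotes.items()
}

for name, rate in normalized.items():
    print(f"{name}: ${rate:.4f} per word")
```

The result is only as good as the assumed averages, so a quick word count on a sample of your documents is worth doing before comparing quotes this way.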

Usability

Park IP rated high for usability. While all of these tools integrate with Relativity (the most common eDiscovery review platform), we looked for those that were also user-friendly and intuitive, with a well-developed interface that is easy to use in or out of a review tool.

Market Flexibility

Some tools are built for specific industries, languages, and use cases; others are designed to apply in many different circumstances. While most of these applications can serve multiple industries to some degree, Authenticity.ai and Systran have a slight edge on the others in their ability to handle data types from many industries.

Accuracy

It’s very, very hard to tell. We pored over the results to try to see a difference, but the fact is, none of them is perfect. However, given the complexity of the task, perfect isn’t really the expectation. Many of these applications use the same backend technology (open-sourced from Google) with their own special sauce built on top. That said, two did stand a bit taller than the rest in the scoring: Veritone and LSI, with Park IP a very close third.

Overall Presentation

Some of these tools are intuitive and present well. Park IP and Authenticity really stood out in this regard, thanks to their modern user interfaces, well-designed workflows, and solid underlying foundations.

Growth Potential

Although we didn’t do a formal evaluation of everyone’s roadmap, we did think it worth mentioning that some of these technologies show tremendous promise if they come through on their planned roadmaps. To be sure we aren’t divulging any trade secrets, we won’t get specific about their future plans. However, we will be keeping an eye on Veritone, Park IP, and LSI; each is working on some exciting new developments.

Conclusion

While our current suite of technology includes LSI and Veritone, which offer a lot of coverage, we’re looking at some of the other technologies for special use cases. If you’re considering MT for an upcoming project, please reach out to us. We’d love to dive into the deep end with you.