At Oasis, we’re committed to finding the best legal technology for
our customers. And considering that so many matters have a foreign
language component these days, we thought it was about time we took a deep dive
into what the market had to offer in terms of machine translation. We attended
demos, created a control set, compared results, and evaluated each tool for six characteristics:
cost, usability, market flexibility, accuracy, overall presentation, and growth
potential. We learned a lot and thought we’d pass our lessons along to those
in need of a strong translation solution.
Before You Dive In
In the context of litigation document review, there are three options when encountering documents in a language in which you are not fluent: 1.) Hire a document review/staffing company that can review the documents in the given language, 2.) Pay a human to translate the original into the native language of the reviewers, or 3.) Use automated translation tools that use algorithms to translate—i.e. “machine translation.”
Option one, native review, is the most accurate and most expensive. Option
two, human translation, gives a high-quality output but can cost a fortune
considering you’re paying for the translation on top of the review. Option three, machine
translation, gives you the lowest-quality output but is fast and (relatively)
inexpensive. Our evaluation focused exclusively on technology solutions in the
third category, machine translation tools, as they are the most common method
for foreign language document review.
We reviewed seven MT technologies in terms of their cost, usability, market flexibility, accuracy, overall presentation, and growth potential. The tools included in our evaluation were:
Each application was reviewed against the criteria in this evaluation. To do so, we designed a system to rate each product fairly for overall presentation, cost, usability, accuracy, market flexibility, and growth potential. To gather the details, we participated in product demos with the software vendors, reviewed documentation and information on each tool, and held review discussion panels with our team of technology experts and industry veterans (the smart kids in the class). From these, we scored each tool from 1 to 5 in each of the categories we identified as helpful for understanding these tools. Those scores underpin the information and findings on each application presented in this evaluation.
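For readers who want to run a similar exercise, here is a minimal sketch of how 1-5 panel scores could be rolled up into a per-criterion profile for one tool. Averaging across reviewers is our assumption for illustration, not a description of the exact method used in this evaluation, and the reviewer scores below are hypothetical.

```python
from statistics import mean

# The six criteria named in this evaluation.
CRITERIA = [
    "cost",
    "usability",
    "market flexibility",
    "accuracy",
    "overall presentation",
    "growth potential",
]


def score_profile(panel_scores):
    """Average each criterion's 1-5 scores across reviewers (assumed method)."""
    profile = {}
    for criterion in CRITERIA:
        scores = [reviewer[criterion] for reviewer in panel_scores]
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"scores for {criterion!r} must be between 1 and 5")
        profile[criterion] = round(mean(scores), 2)
    return profile


# Hypothetical scores from two reviewers for one unnamed tool.
example_panel = [
    {"cost": 4, "usability": 5, "market flexibility": 3,
     "accuracy": 4, "overall presentation": 5, "growth potential": 4},
    {"cost": 5, "usability": 4, "market flexibility": 3,
     "accuracy": 3, "overall presentation": 4, "growth potential": 4},
]

example_profile = score_profile(example_panel)
for criterion, score in example_profile.items():
    print(f"{criterion}: {score}")
```

Keeping one score per criterion, rather than collapsing everything into a single total, makes it easier to compare relative strengths and weaknesses across tools.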
These tools are all remarkably similar.
That’s not to say there aren’t interesting differences that could tip the
scales for different use cases, but it’s not possible to boil this decision
down to something simple. Honestly, we were hoping for more definitive results.
As a result of our findings, we created a simple diagram that displays the
relative strengths and weaknesses of each application instead of ranking the
technologies from best to worst. We figured this would be more helpful for you
to review and choose the tool that fits your needs. We’ve also included some of
our analysis for each of the six criteria, along with a few key takeaways for
each of the products we reviewed.
Prices vary somewhat, but not enough to make price a useful differentiator in an on-premise SaaS use case. Most tools have a per-word pricing model; however, some present pricing as a per-page or per-document rate. Either way, the math works out to be reasonably comparable. It is worth noting that we found most of these companies to be more than willing to work with you on pricing where needed.
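If you want to compare quotes yourself, one approach is to convert every quote to an equivalent per-word rate. The sketch below is illustrative only: the vendor names, prices, words-per-page figure, and pages-per-document figure are all hypothetical assumptions, so substitute averages measured from your own document set.

```python
def per_word_rate(price, unit, words_per_page=300, pages_per_document=4):
    """Convert a quoted price into an equivalent per-word rate.

    words_per_page and pages_per_document are assumed averages; replace
    them with figures measured from your own documents.
    """
    if unit == "word":
        return price
    if unit == "page":
        return price / words_per_page
    if unit == "document":
        return price / (words_per_page * pages_per_document)
    raise ValueError(f"unknown pricing unit: {unit!r}")


# Hypothetical quotes, not actual vendor pricing.
quotes = {
    "Vendor A": (0.0005, "word"),
    "Vendor B": (0.15, "page"),
    "Vendor C": (0.60, "document"),
}

normalized = {
    name: per_word_rate(price, unit) for name, (price, unit) in quotes.items()
}
for name, rate in normalized.items():
    print(f"{name}: ${rate:.5f} per word")
```

The result is only as good as the averages you plug in, so a page-heavy or image-heavy collection may shift the comparison.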
Park IP rated high for usability. While all these tools integrate with Relativity (the most common eDiscovery review platform), we looked for those that were also user-friendly and intuitive, with a well-developed interface for ease of use in or out of a review tool.
Some tools are built for specific industries, languages, and use
cases; others are designed to apply in many different circumstances. While most
of these applications have some capability of being multi-industry friendly,
Authenticity.ai and Systran have a slight edge on the others with their ability
to handle many industry data types.
It’s very, very hard to tell. We pored over the results to try to see a difference, but the fact is, none of them are perfect. However, given the complexity of the task, perfect isn’t really the expectation. Many of these applications use the same backend technology (open-sourced from Google) with their own special sauce built on top. That said, two did stand a bit taller than the rest in the scoring: Veritone and LSI, with Park IP a very close third.
Some of these tools are intuitive and present well: Park IP and Authenticity
really stood out in this regard due to their modern user interfaces and
well-designed workflows, as well as the solid foundations underlying the tools.
Although we didn’t do a formal evaluation of everyone’s roadmap, we
did think it worth mentioning that some of these technologies show tremendous
promise if they come through on their planned roadmaps. To be sure we aren’t
divulging any trade secrets, we won’t get specific about their future plans.
However, we will be keeping an eye on Veritone, Park IP, and LSI; each is
working on some exciting new developments.
While our current suite of technology includes LSI and Veritone—which offer a lot of coverage—we’re looking at some of the other technologies for special use cases. If you’re considering MT for a solution on an upcoming project, please reach out to us. We’d love to dive into the deep end with you.