Using Cloud Solutions for Translation: Yes or No?
We read about the benefits of using the cloud for work, such as cloud applications and storage. What we don’t see are warnings about the risks. You have to look for these specifically; that information doesn’t come to you the way the claimed benefits do. This morning I got yet another e-mail from my web hosting service, encouraging me to use their new cloud storage, and I am tired of receiving iCloud notifications on my phone when I specifically chose not to use that service. So I’d like to share some things I learned in a graduate course on cloud computing at Boston University a couple of years ago, including some essential information from “Cloud Computing: A Practical Approach” by Anthony T. Velte, Toby J. Velte, and Robert Elsenpeter, to explain why using the cloud may not be such a good idea, at least for our work.
First of all, we should understand that cloud computing is not for everyone and it is not for everything. Just because it’s there and offers some benefits doesn’t mean we should use it. According to the authors, whether or not we should use cloud computing depends on a number of factors, including whether our data is regulated. Is our data (i.e. the original texts we translate and our translations) regulated? Well, the original text is not even our data; it is the client’s. And our translations are different-language equivalents of someone else’s data. So even if it is not regulated by the client, it’s still not our data to share.
Now I could just stop the article here. It is not our data; we simply do not have permission to store it on third-party equipment or manipulate it with third-party applications. End of story. But for the curious, I’ll give some more information.
What does “using the cloud” really mean? What would we use exactly? What is the cloud?
Based on conversations I’ve had with colleagues, many see the cloud as something obscure, something abstract, pretty much like a real cloud without a specific shape or form, something up there, hard to conceive, something shared by many or by all. Actually it is something very specific and definitely not abstract.
These are the three major implementations of cloud computing:
1. Compute Clouds
2. Cloud Storage
3. Cloud Applications
Compute clouds allow us to access applications and on-demand computing resources maintained on a provider’s equipment; examples are Amazon’s EC2 and Google App Engine. (On-demand resources mean that you don’t have to maintain the infrastructure and run the code on your own equipment; the resources are on someone else’s equipment and you use and pay for them only when you need them.)
Cloud storage is the most popular implementation. It allows us to maintain our files on a cloud-storage vendor’s equipment. (This is what my website hosting service keeps bothering me about. I am not interested, thank you very much.)
Cloud applications are similar to compute clouds in that they allow us to use applications maintained on a provider’s equipment; the difference from compute clouds is that cloud applications use software that relies on cloud infrastructure, i.e. it depends on the infrastructure of the Internet itself. Examples are Skype (peer-to-peer computing), MySpace or YouTube (web applications, delivered to users via a browser), and Google Apps (Software as a Service, or SaaS).
Let’s consider the translation of a medical record. I won’t even go into the habit some translators have of asking terminology questions on open translation portals and including the patient’s name, because that is simply beyond me. It is inconsiderate, unacceptable, inconceivable! But that’s another story. Let’s focus on the cloud. So let’s say that you want to store the translation in a backup directory you have on the cloud, i.e. on someone else’s equipment, or translate a few sentences with an online translation tool or a CAT tool that uses a shared memory stored on a cloud server (requiring you to also save your translation in the shared memory). What is the problem with that? According to Velte et al.:
“If you want to use cloud computing and post data covered by Health Insurance Portability and Accountability Act (HIPAA) on it, you are out of luck. Well, let’s rephrase that: if you want to put HIPAA data on a cloud, you shouldn’t. That’s sensitive healthcare information and the fact that HIPAA data could commingle on a server with another organization’s data will likely get the attention of an observant HIPAA auditor.”
No matter how much cloud giants like Google and Microsoft try to reassure us that data placed on a cloud is safe, all it takes is one tiny breach to let sensitive data loose. And of course this raises another question: if the data is let loose, who is liable?
According to the authors: “If you have data that is regulated—like HIPAA or Sarbanes-Oxley—you are well advised to be very careful in your plans to place data on a cloud. After all, if you have posted a customer’s financial data and there’s a breach, will they go after the cloud provider or you? […] It is probably best to avoid a painful fine, flesh-eating lawyers, and possible jail time.” Note that jail time can be 1 to 10 years for HIPAA and up to 20 years for Sarbanes-Oxley data. I won’t mention the financial penalties because I’d like to spare you the heart attack, but if you’d like to know about them, I refer you to this book, page 26.
Even if the customer considers going after the cloud provider too, chances are the cloud provider has already foreseen this possibility and has made sure to absolve itself of any responsibility in its agreement with you. If you want to know Google’s attitude towards confidentiality, I refer you to a couple of old articles of mine, “Confidentiality and Gmail” and “Confidentiality and Google Translate” where you’ll see that by accepting Google Translate’s terms of service we grant Google permission to use our content to improve its services.
I’m not saying that cloud providers like Google are after you. And not all applications are like Google Translate, which wants to gain something from your translations. In fact, the big vendors have strict security measures. What I am saying is that you should not count on the cloud provider to protect or respect the confidentiality of your data or your client’s data. In spite of the provider’s security measures, you are responsible for keeping your data secure.
So the cloud provider is not after you. But you know what? Someone else is. Can you guess who?
Hackers! Yes, hackers can cause a lot of damage if they get access to your data or your client’s data. They can get access to the company trade secrets you translated and sell them to the company’s competitor. They can get access to a company’s proprietary information and threaten to disclose it if they don’t receive a very generous sum. There are too many scenarios to list. Use your imagination and know that these things do happen. And on a not-so-funny note, when I took a “certified hacker” course, I was shocked to learn that some hackers do it for… fun! (Wait, let me explain the course: I worked as a software quality engineer for a while, where testing the quality of software products also meant testing security, and to test security you need to know how to break the software. Hence the course, paid for by the company.) They do it just because they can and just because they want to test themselves. This too happens. You don’t want them knocking on your door and telling you, “You either pay me 20,000 dollars and get all those financial records back, or you pay 100,000 dollars in fines for breach of confidentiality.” It sounds far-fetched and maybe it doesn’t happen often, but it can happen. Hackers are mostly after larger corporations, not individual translators; on the other hand, when they hack into data stored on a cloud, they care more about the data than about who put it there.
What does all this mean? Should we never store data on a provider’s cloud or use cloud services?
Not necessarily. If you want to store data on a provider’s cloud, one thing you can do is encrypt it. Look for programs like TrueCrypt (www.truecrypt.org) to do this. That way, if someone gets access to your data, they won’t be able to read it.
Another important thing to consider is choosing paid services over services funded by advertising. When it comes to free cloud services, Velte et al. point out that they “are most likely to rummage through your data looking to assemble user profiles that can be used for marketing or other purposes. No company can provide you with free tangible goods or services and stay in business for long. They have to make money somehow, right?”
Last but not least, always, ALWAYS read your agreement with any cloud service provider carefully. Make sure you understand the privacy and security implications of using the service and that you understand and agree with the terms of service.
Now, what if you are working in a translation team and need to exchange terminology databases or translation memories? It may be convenient to use a cloud service, but is it safe? And what if you don’t have a say in this? What if your client does not provide an in-house server but wants you to use a cloud-based service or application? In that case using a cloud service may actually be a good idea and make your team’s work easier. But what about confidentiality? Well, if your client is the one that asked you to use that service and is coordinating the workflow, then you are not liable if the cloud provider’s security measures are breached (though it’s a good idea to double-check with the client anyway). If you are the one choosing a cloud solution for a project for a direct client, then you may want to follow the above advice: look for a paid service and read the user agreement very carefully. Tell your client that you are using a cloud service and make sure they give you permission in writing. Most end clients don’t care about the details of your process; they don’t care if you’re using such and such CAT tool or terminology-management tool. But when using a cloud service it is advisable (read: advantageous to you, in terms of liability) to have your client’s permission.
So to the original question, “Using cloud solutions: yes or no?”, my answer is this:
– If you don’t need them, don’t use them.
– If they increase your efficiency or generally improve your work process, use them but make sure your client knows and make sure you agree with the terms of service. It is safer to use a paid service.
– For storage, if it makes sense for you to store sensitive data (your own data, not a client’s) on a cloud, encrypt it first.
And keep in mind that you don’t have to follow the crowd; just because many people use a certain cloud service doesn’t make it any safer. Consider your own needs and the sensitivity of your data; that is, your clients’ data.
Illustrated by Juan Tavela