cloud failure modalities

There’s a tale of woe getting some airtime on the interwebs from an angst-ridden New York undergraduate (reading between the lines) who has somehow had an entire, quite substantial, Google account deleted. The post’s contention is, or at least includes, the idea that deleting such a profile is tantamount to deleting one’s life. The facts of the case are murky – I’d link to some Google+ discussions, but I can’t find a way to do that – but regardless of this particular young person’s predicament, the story highlights some bigger questions about trusting cloud services.

At ground level, using cloud services is good for resilience, because the service provider is much better placed than I am to build highly reliable infrastructure with negligible downtimes and routine automatic backups.  It’s information processing as an abstracted service. It’s wonderful. It just works.

Of course, the wise user immediately asks what happens if you scratch the abstraction a little.  Does the service provider derive benefit from mining my data – and does this impact my privacy?  Worse, are some employees of the service provider (or, if they are slack, random other internet users) able to eavesdrop upon my communications or mine my data for more sinister purposes?  This is much less wonderful, and aside from some pie-in-the-sky discussions of fully homomorphic encryption, or some slightly more hopeful architectures with trusted computing, there is little or no known solution right now.

But at the next level, it starts to get interesting. What are the credentials which provide access to this wonderful world? What if I forget them? Or worse, what if I disconnect for a couple of decades from a particular archive, and then later want to re-establish my ownership and control over the data I placed there? Few credential recovery schemes are appropriate in the long-term cases. That situation is compounded further if the data belongs not to an individual but to some corporate entity. The lesson: make sure your inventory management processes involve rolling forward all your credentials on a regular basis.
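As a rough illustration of what "rolling forward" might look like in practice, here is a minimal sketch of an inventory check that flags credentials nobody has verified recently. The `Credential` record, the service names, and the one-year threshold are all assumptions of mine for illustration, not a description of any real tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Credential:
    service: str         # hypothetical service name
    owner: str           # person or role responsible for the credential
    last_verified: date  # last time someone logged in and confirmed access


def stale_credentials(inventory, today, max_age=timedelta(days=365)):
    """Return the credentials not verified within max_age of today."""
    return [c for c in inventory if today - c.last_verified > max_age]


# Illustrative inventory; the entries are made up.
inventory = [
    Credential("photo-archive", "alice", date(2010, 3, 1)),
    Credential("mail", "alice", date(2011, 6, 15)),
]

for cred in stale_credentials(inventory, today=date(2011, 8, 1)):
    print(f"{cred.service}: last verified {cred.last_verified}, owner {cred.owner}")
```

The point of recording an owner alongside each entry is the corporate case: when the person leaves, the stale list tells you which archives are about to become orphaned.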

And then, there’s the case in point: what if I am in dispute with the service provider? The data I placed with them is unambiguously mine, not theirs, but I don’t think our system of law yet really places on them an obligation not to destroy it. I think that would have to be governed by contract – and I doubt that many cloud service providers today would want to write that into their contracts, and they hold all the cards. I suspect some cases will have to go to court before that particular set of rights and obligations gets codified a little better. Michael Wisner likens the situation to that of landlord and tenant: the former may evict the latter, but must provide reasonable means for the recovery of the tenant’s property. This has some mileage, I think.

The topic mentioned in an earlier article is also germane: the legal jurisdiction in which the service provider sits may be far from clear – and it may or may not be aligned with the physical location(s) of the disc(s) which hold (the copies of) my data. The possibility that my data may be trawled (or my interactions eavesdropped upon) is very real, as is the possibility that the service may be suspended by a court order, even if the data or service under investigation is not mine. Related, of course, is the blocking of my service by an over-zealous or clumsy filter, even if I am accused of no wrong-doing.

Finally – at least in the simple cases – there is the problem of the provider who is struggling to do business.  Perhaps they are insolvent, struggling to pay their staff; perhaps their IT is crumbling, or worse, subject to sustained attack; perhaps their business is simply undercut by a rival with sudden visibility.  If too many of the provider’s customers decide simultaneously to extract their data, they will push the provider past their peak load capacity – at the very time when the provider is not well-placed to add extra capacity dynamically.  So services will start to fail, data will cease to flow, and the provider’s predicament will worsen.  The situation is not unlike a run on the bank.
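The bank-run dynamic can be made concrete with a toy model. This is only a sketch under assumptions I have invented: every customer holds the same amount of data, a fixed percentage of the remaining customers asks for everything each day, and the provider can ship a fixed amount per day. None of the numbers come from a real provider.

```python
def simulate_run(customers, gb_per_customer, egress_gb_per_day,
                 daily_exit_percent, days):
    """Return (day, customers_leaving, undelivered_backlog_gb) for each day.

    Each day a fixed percentage of the remaining customers requests all of
    its data; whatever the provider cannot ship carries over as backlog.
    """
    remaining = customers
    backlog_gb = 0
    history = []
    for day in range(1, days + 1):
        leaving = remaining * daily_exit_percent // 100
        remaining -= leaving
        backlog_gb += leaving * gb_per_customer
        backlog_gb -= min(backlog_gb, egress_gb_per_day)
        history.append((day, leaving, backlog_gb))
    return history


# Ordinary churn versus a loss of confidence, same provider capacity.
calm = simulate_run(10_000, 50, 20_000, daily_exit_percent=1, days=5)
panic = simulate_run(10_000, 50, 20_000, daily_exit_percent=20, days=5)

for day, leaving, backlog in panic:
    print(f"day {day}: {leaving} customers leaving, {backlog} GB undelivered")
```

With ordinary churn the backlog stays at zero; once a fifth of the customers head for the exit each day, the undelivered backlog grows for as long as the exodus outpaces capacity, which is the point at which services start to fail and the remaining customers have even more reason to leave.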

In a service-oriented world, we may assemble numerous services, from several providers, in order to satisfy our own needs, or in order to provide our own services to third parties.  All of the above problems may apply to the services I am relying on – and none may be evident to my own customers.  The abstraction, once again, looks pristine, yet it is rotten inside.  So the additional failure mode is simply a random combination of all of the above.

Are clouds good for resilience?  Are we looking in the right place to try to decide what to do about it?  I’m far from sure.

Some standardized interfaces for data transfer might help.  There’s a potential market for cloud services which back-up the data from other cloud services.  Of course, in buying those I might imagine I was adding resilience through redundancy, without realising that the second service is actually co-located with the first!  The complexities of authentication and delegation would also require even more attention than they already do, in that multiple-provider world of redundancy.
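The co-location trap is easy to check for if, and it is a big if, providers disclose where they actually run. A minimal sketch, assuming I have somehow obtained a map from each service to the sites it depends on (the service and datacentre names are hypothetical):

```python
from itertools import combinations

# Hypothetical map from each service to the sites it ultimately runs on.
hosting = {
    "primary-storage": {"datacentre-A", "datacentre-B"},
    "backup-service": {"datacentre-B"},
    "second-backup": {"datacentre-C"},
}


def shared_infrastructure(hosting):
    """Return {(service_1, service_2): shared_sites} for each overlapping pair."""
    overlaps = {}
    for (a, sites_a), (b, sites_b) in combinations(sorted(hosting.items()), 2):
        common = sites_a & sites_b
        if common:
            overlaps[(a, b)] = common
    return overlaps


for (a, b), sites in shared_infrastructure(hosting).items():
    print(f"{a} and {b} share: {', '.join(sorted(sites))}")
```

In practice the hard part is populating that map, since the whole appeal of the abstraction is that I am not supposed to need to know.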

Are all of these things worth losing sleep over?  How do the risks fall out?  Do the benefits of having the cloud outweigh the problems in the margins?  Would that we knew.
