email retention

Someone in the group suggested a blog post on email retention.  It’s a good topic, because it tracks the co-evolution of technology and process.

The Evolution

Back in the day, storage was expensive – relative to the cost of sending an email, anyway.   To save space, people would routinely delete emails when they were no longer relevant.

Then storage got cheap, and even though attachments got bigger and bigger, storing email ceased to be a big deal.  By the late 1990s, one of my colleagues put it to me that the time you might spend deciding whether or not to delete something already cost more than the cost of storing it forever.  I have an archive copy of every email I’ve sent or received – in professional or personal capacities – since about 1996 (even most or all of the spam).

Happily, one other technology helped with making email retention worthwhile: indexing.  This, too, is predicated on having enough storage to store the index, and having enough CPU power to build the index.  All of this we now have.

However, a third force enters the fray: lawyers started discovering the value of emails as evidence (even if it’s rubbish as evidence, needing massive amounts of corroboration if it is not to be forged trivially). And many people – including some senior civil servants, it seems – failed to spot this in time, and were very indiscreet in their subsequently- subpoenaed communications.

As a result, another kind of lawyers – corporate lawyers – issued edicts which first required, and then forced, employees of certain companies to delete any email more than six months old.  That way, they cannot be ‘discovered’ in an adverse legal case, because they have already been erased.

Never mind that many people’s entire working lives are mediated by email today: the email is an accurate and (if permitted) complete record of decisions taken and the process by which that happened.  Never mind that emails have effectively replaced the hardbound notebooks that many engineers and scientists would use to retain their every thought and discussion (though in many places good lab practice retains such notebooks).  Never mind that although it is creaking under the present strain of ‘spam’ and ‘nearly spam’ (the stuff that I didn’t want, but was sent by coworkers, not random strangers), we simply haven’t got anything better.

The state-of-the art

So now, in those companies, there is no email stored for more than six months, yes?  Well, no.  Of course not.  There are lots of emails which are just too valuable to delete.  And so people extract them from the mail system and store them elsewhere.  Or forward them to private email accounts.  There are, and always will be, many ways to defeat the corporate mail reaper.  The difference is that the copies are not filed systematically, are not subject to easy search by the organisation, and will probably not be disclosed to regulators or in other legal discovery processes.  This is the state-of-the art in every such organisation I’ve encountered (names omitted to protect the … innocent?).

Sooner or later, a quick-witted external lawyer is going to guess that this kind of informal archiving might help their case, and is going to manage to dig deeper into the adversary’s filestores and processes.  When they find some morsels of beef secreted in unlikely places, there will be a rush of corporate panic.

The solution

It’s easy to spot the problem.  It’s much harder to know what to do about it.  Threatening the employees isn’t very productive – especially if the ‘security’ rule is at odds with the goal of getting their work done.  Making it a sacking offence, say, to save an email outside the corporate mail  system is just going to make people more creative about what they save, and how they save it.  Unchecked retention, on the other hand, will certainly leave the organisation ‘remembering’ things it would much rather it had ‘forgotten’.

At least it would be better if the ‘punishment’ matched the crime: restricting the retention of email places the control in the wrong place.   It would be much better to reserve the stiff penalties for those engaged in libel, corporate espionage, anticompetitive behaviour, and the rest.  Undoubtedly, that would remain a corporate risk, and a place where prevention seems better than cure: the cost to the organisation may be disproportionately greater than any cost that can be imposed upon the individual.  But surely it’s the place to look, because in the other direction lies madness.

Outsourcing undermined

In the current headlong rush towards cloud services – outsourcing, in other words – leads to increasingly complex questions about what the service provider is doing with your data.  In classical outsourcing, you’d usually be able to drive to the provider’s data centre, and touch the disks and tapes holding your precious bytes (if you paid enough, anyway).  In a service-oriented world with global IT firms using data centres which follow the cheapest electricity, sometimes maybe themselves buying services from third parties, that becomes a more difficult task.

A while ago, I was at a meeting where someone posed the question “What happens when the EU’s Safe Harbour Provisions meet the Patriot Act?”.  The former is the loophole by which personal data (which normally cannot leave the EU) is allowed to be exported to data processors in third countries, provided they demonstrably meet standards equivalent to those imposed on data processors within the EU.  The latter is a far-reaching piece of legislation allowing US law enforcement agencies powers of interception and seizure of data.  The consensus at the meeting was that, of course the Patriot Act would win: the conclusion that Safe Harbour is of limited value.  Incidentally, this neatly illustrates the way that information assurance is about far more than just some crypto (or even cloud) technology.

Today, ZDNet reports that the data doesn’t even have to leave the EU for it to be within the reach of the Patriot Act:  Microsoft launched their ‘Office 365’ product, and admitted in answer to a question that data belonging to (relating to) someone in the EU, residing on Microsoft’s servers within the EU, would be surrendered by Microsoft – a US company – to US law enforcement upon a Patriot Act-compliant request.  Surely, then, any multinational (at least, those with offices? headquarters? in the US) is in the same position.  Where the subject of such a request includes personal information,  that faces them with a potential tension: they either break US law or they break EU law.  I suppose they just have to ask themselves which carries the stiffer penalties.

Now, is this a real problem or just a theoretical one? Is it a general problem with trusting the cloud, or a special case that need not delay us too long?   On one level, it’s a fairly unique degree of legal conflict, from two pieces of legislation that were rather deliberately made to be high minded and far reaching in their own domains.  But, in general, cloud-type activity is bound to raise jurisdictional conflicts: the data owner, the data processor, and the cloud service provider(s) may all be in different, or multiple, countries, and any particular legal remedy will be pursued in whichever country gives the best chance of success.

Can technology help with this?  Not as much as we might wish, I think.  The best we can hope for, I think, is an elaborate overlay of policy information and metadata so that the data owner can make rational risk-based decisions.  But that’s a big, big piece of standards work, and making it comprehensible and usable will be challenging. And, it looks like there could be at least a niche market for service providers who make a virtue of not being present in multiple jurisdictions.  In terms of trusted computing, and deciding whether the service metadata is accurate, perhaps we will need a new root of trust for location…