Outsourcing undermined

The current headlong rush towards cloud services – outsourcing, in other words – leads to increasingly complex questions about what the service provider is doing with your data. In classical outsourcing, you’d usually be able to drive to the provider’s data centre and touch the disks and tapes holding your precious bytes (if you paid enough, anyway). In a service-oriented world of global IT firms using data centres which follow the cheapest electricity – and sometimes themselves buying services from third parties – that becomes a much more difficult task.

A while ago, I was at a meeting where someone posed the question “What happens when the EU’s Safe Harbour Provisions meet the Patriot Act?”. The former is the loophole by which personal data (which normally cannot leave the EU) may be exported to data processors in third countries, provided they demonstrably meet standards equivalent to those imposed on data processors within the EU. The latter is a far-reaching piece of legislation granting US law enforcement agencies powers of interception and seizure of data. The consensus at the meeting was that, of course, the Patriot Act would win – the conclusion being that Safe Harbour is of limited value. Incidentally, this neatly illustrates the way that information assurance is about far more than just some crypto (or even cloud) technology.

Today, ZDNet reports that the data doesn’t even have to leave the EU for it to be within the reach of the Patriot Act: Microsoft launched their ‘Office 365’ product, and admitted in answer to a question that data belonging to (or relating to) someone in the EU, residing on Microsoft’s servers within the EU, would be surrendered by Microsoft – a US company – to US law enforcement upon a Patriot Act-compliant request. Surely, then, any multinational (at least, one with offices – or is it headquarters? – in the US) is in the same position. Where the subject of such a request includes personal information, they face a real tension: comply and break EU law, or refuse and break US law. I suppose they just have to ask themselves which carries the stiffer penalties.

Now, is this a real problem or just a theoretical one? Is it a general problem with trusting the cloud, or a special case that need not delay us too long? On one level, it’s an unusually stark legal conflict, arising from two pieces of legislation that were rather deliberately made to be high-minded and far-reaching in their own domains. But, in general, cloud-type activity is bound to raise jurisdictional conflicts: the data owner, the data processor, and the cloud service provider(s) may all be in different, or multiple, countries, and any particular legal remedy will be pursued in whichever country gives the best chance of success.

Can technology help with this? Not as much as we might wish, I think. The best we can hope for is an elaborate overlay of policy information and metadata, so that the data owner can make rational risk-based decisions. But that’s a big, big piece of standards work, and making it comprehensible and usable will be challenging. And it looks like there could be at least a niche market for service providers who make a virtue of not being present in multiple jurisdictions. In terms of trusted computing, and deciding whether the service metadata is accurate, perhaps we will need a new root of trust for location…
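To make that concrete, here is a minimal sketch of the kind of metadata overlay I have in mind. Every name and field here is my own invention – no such standard exists today – but it shows how a data owner might mechanically ask “which legal systems could reach my data?”:

```typescript
// Hypothetical sketch only: these field names come from no existing
// standard. The idea is that a service description carries enough
// metadata for a data owner to reason about jurisdictional exposure.
interface ServiceMetadata {
  provider: string;                 // who operates the service
  dataCentreCountries: string[];    // where the bytes physically reside
  providerHomeCountries: string[];  // where the provider is incorporated
  subProcessors: ServiceMetadata[]; // third parties the provider relies on
}

// A crude risk rule: the set of jurisdictions that could plausibly
// compel disclosure, following the chain of sub-processors.
function reachableJurisdictions(m: ServiceMetadata): Set<string> {
  const out = new Set([...m.dataCentreCountries, ...m.providerHomeCountries]);
  for (const sub of m.subProcessors) {
    for (const c of reachableJurisdictions(sub)) {
      out.add(c);
    }
  }
  return out;
}
```

On the Office 365 example above, a service whose data centres are all in the EU but whose provider is incorporated in the US would still report the US as a reachable jurisdiction – which is exactly the point.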

Experiences at TaPP’11

On Monday and Tuesday this week I attended the third “Theory and Practice of Provenance” workshop in Crete. The event was a great success: lively discussion from people presenting interesting and practical work.   For those who don’t know about Provenance, here’s a snappy definition:

‘Provenance’ or ‘lineage’ generally refers to information that ‘helps determine the derivation history of a data product, starting from its original sources’. In other words, a record of where data came from and how it has been processed.
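To make the definition concrete, here is a toy illustration of what a single provenance record might contain. The field names are my own, though they loosely echo the artifact/process/agent vocabulary of the Open Provenance Model:

```typescript
// An invented, minimal provenance record: which inputs were consumed,
// by what process, run by whom, and when.
interface ProvenanceRecord {
  output: string;    // identifier of the derived data product
  inputs: string[];  // identifiers of its immediate sources
  activity: string;  // the process or query that produced the output
  agent: string;     // the user or service responsible
  timestamp: string; // ISO 8601 time of the derivation
}

const example: ProvenanceRecord = {
  output: "figure3.png",
  inputs: ["raw-readings.csv", "calibration.json"],
  activity: "plot-temperatures v1.2",
  agent: "alice@lab.example",
  timestamp: "2011-06-21T09:30:00Z",
};
```

Chain enough of these together and you can walk back from any data product to its original sources.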

Provenance applies to many different domains, and at the TaPP’11 workshop there were researchers working on theoretical database provenance, on scientific workflows, on practical implementation issues, and on systems provenance (collecting provenance at the operating-system level), as well as a few security people. I presented a short paper on collecting provenance in clouds, which got some useful feedback.

The event closed with a debate on “how much provenance should we store”, with most people sitting somewhere between two extremes: either we store just the things we think are most important to our queries, or we store everything that could possibly impact what we are doing. The arguments on both sides were good: there was a desire to avoid collecting too much useless data, as this slows down search and has an attached cost in terms of storage and processing. On the other hand, the point was made that we don’t actually know how much provenance is enough, and that if we don’t collect all of it, we may come back and find we missed something. Considering the cheapness of storage and processing power, some believed that the overhead was unimportant. As a security researcher interested in trusted provenance, the “collect everything” approach seemed like my cup of tea. If the collecting agent were trusted and could attest to its proper behaviour, provenance information could be made much more tamper-resistant.
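To illustrate the tamper-resistance point – this is my sketch, not anything presented at the workshop – a standard trick is to hash-chain the records, so that a trusted collecting agent only ever needs to attest to the digest at the head of the chain:

```typescript
import { createHash } from "node:crypto";

// Each record carries the digest of its predecessor, so altering any
// past record changes every digest after it. A trusted agent then only
// needs to attest to the digest of the most recent record.
interface ChainedRecord {
  payload: string;    // the serialised provenance record itself
  prevDigest: string; // digest of the previous record ("" for the first)
}

function digestOf(r: ChainedRecord): string {
  return createHash("sha256")
    .update(r.prevDigest)
    .update(r.payload)
    .digest("hex");
}

function append(chain: ChainedRecord[], payload: string): void {
  const prev = chain.length > 0 ? digestOf(chain[chain.length - 1]) : "";
  chain.push({ payload, prevDigest: prev });
}
```

If the agent appending records runs in a measured, attestable environment, a verifier can check the whole history against a single signed digest.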

However, from the perspective of someone involved in privacy and looking at storage of context (which is a part of provenance), the preservation of privacy seemed to be an excellent reason not to collect everything. For example, I suspect that academic researchers don’t want to store all their data sources: what if you browsed Wikipedia for an overview of a subject area, and that was forever linked with your research paper? Similarly, full provenance during computation might reveal all the other programs you were using, many of which you might not want to share with your peers. Clearly some provenance information has to stay secret.

The rebuttal to this point was that this is an argument for controlled disclosure rather than controlled collection. I suspect this argument will recur often. From a logical perspective (considering only confidentiality) it might seem enough to apply access controls to stored provenance instead of limiting its collection. However, this adds some interesting requirements. It is now necessary for users to specify policies on what they do and don’t want to reveal, which has been shown to be difficult in practice. Furthermore, the storage of confidential data requires better security than the storage of public (if high-integrity) data. The problem quickly turns into digital rights management, which is easier said than implemented. I believe that controlled disclosure and controlled collection are fundamentally different approaches, and that the conscientious privacy researcher must choose the latter.
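The difference is easy to state in code. Under controlled collection a sensitive record is never written at all; under controlled disclosure it is written and then withheld at query time. A toy sketch (the policy type is invented):

```typescript
// true = this record is too sensitive to reveal
type SensitivityPolicy = (payload: string) => boolean;

// Controlled collection: sensitive records are simply never stored.
function collect(log: string[], payload: string, sensitive: SensitivityPolicy): void {
  if (!sensitive(payload)) {
    log.push(payload);
  }
}

// Controlled disclosure: everything is stored, but filtered on the way out.
function disclose(log: string[], sensitive: SensitivityPolicy): string[] {
  return log.filter((payload) => !sensitive(payload));
}
```

The asymmetry is the point: under controlled disclosure the secret still sits on disk, where it must be protected forever (the digital rights management problem again); under controlled collection it can never leak, but equally it can never be recovered.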

I still believe that provenance can learn quite a lot from Trusted Computing, and vice versa. In particular, the concept of a “root of trust” – the point at which your trust in a computing system starts, and an element which you may have no ability to assure – is relevant. Provenance data must also start somewhere: with the first element in the history of a data item, and the trusted agent used to record it. Furthermore, the different types of root of trust are relevant: provenance is reported just as attestations report platform state. In trusted computing we have a “root of trust for reporting”, and perhaps we also need one in provenance. The same is true for measurement of provenance data, and for storage. Andrew Martin and I wrote about some of this in our paper at TaPP last year, but there is much more to do. Could TCG attestation conform to the Open Provenance Model? Can we persuade those working in operating system provenance that the rest of the trusted computing base – the BIOS, bootloader, option ROMs, SMM, and so on – also needs to be recorded as provenance? Can the provenance community show us how to query our attested data, or make sense of a Trusted Network Connect MAP database?
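As a toy illustration of that second question, one could imagine recasting a measured boot as provenance, with each stage of the trusted computing base recorded as an input to the stage it launches. This is pure speculation on my part, with invented names throughout:

```typescript
// Speculative sketch: treat each measured-boot stage as a provenance
// record, so the "derivation history" of the running platform includes
// the components of the trusted computing base. (Grossly simplified:
// real boots also measure option ROMs, SMM code, and more.)
const bootStages = ["BIOS", "bootloader", "kernel", "running platform"];

const platformProvenance = bootStages.slice(1).map((stage, i) => ({
  output: stage,
  inputs: [bootStages[i]], // the component that launched this stage
  activity: "measured launch",
  agent: "root of trust for measurement",
}));
```

A provenance query over records like these would answer “what did this platform boot through?” in much the same way an attestation log does today.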

Finally, one of the most interesting short talks was by Devan Donaldson, who studied whether or not provenance information actually makes data more trustworthy. He performed a short study of various academic researchers, using structured interviews, and found (perhaps unsurprisingly) that yes, some provenance information really does improve the perception of trustworthiness in scientific data. He also found that a key factor, in addition to provenance, was the ability to use and query the new data. While these results are what we might expect, they do confirm the theory that provenance can be used to enhance perceived trustworthiness, at least in an academic setting. Whether it works outside academia is a good question: could provenance of the Climategate data have reassured the press and the public?

on an unfortunate tension

It’s frustrating when you’re not allowed to use electronic devices during the first and last fifteen minutes of a flight – sometimes much longer. I rather resent having to carry paper reading material, or to stare at the wall in those periods. On today’s flight, they even told us to switch off e-book readers.

E-book readers! Don’t these people realise that the whole point of e-paper is that you don’t turn it off? It consumes a minimal amount of power, which is how a Kindle can survive a month on a single charge. It has no ‘off’ switch per se: its slide switch simply invokes a “screen saver” mode. This doesn’t change the power consumption by much; it just replaces the on-screen text with pictures and disables the push buttons.

And the answer is that of course they don’t know this stuff. Why would they? Indeed, it would be absurd to expect a busy cabin attendant to be able to distinguish, say, an e-book reader from a tablet device. If we accept for a moment the shaky premise that electronic devices might interfere with flight navigation systems, then we must accept that the airlines need to ensure that as many as possible of these are switched off – even those with no off switch to speak of, whose electromagnetic emissions would be difficult to detect at a distance of millimetres.

Of course, this is a safety argument, but much the same applies to security. Even the best of us would struggle to look at a device, look at an interface, and decide whether it is trustworthy. This, it seems to me, is a profound problem. I’m sure evolutionary psychologists could tell us in some detail about the kind of risks we are adapted to evaluate. Although we augment those talents through nurture and education, cyber threats look different every day. Children who have grown up in a digital age will have developed much keener senses for evaluating cyber-goodness than those coming to these things later in life, but we should not delude ourselves into thinking this is purely a generational thing.

People have studied the development of trust at some length. Although the clues we use for trusting people seem to be quite well established, we seem to be all over the place in deciding whether to trust an electronic interface – and will tend to do so on the basis of scant evidence. (insert citations here). That doesn’t really bode well for trying to improve the situation. In many ways, the air stewardess’s cautionary approach has much to commend it, but the adoption of computing technology has always seemed to be led by a ‘try it and see’ curiosity, and we destroy that at our peril.

Explaining the new rules on cookies

The European Union recently tightened the e-Privacy Directive (pdf of the full legislation), requiring websites to obtain user consent before storing cookies. You could be forgiven for thinking that this is a good thing: long-lived cookies can be something of a menace, as they allow your behaviour to be tracked by websites. This kind of tracking is used for “good” things such as personalisation and session management, as well as “bad” things like analytics and personalised marketing, which often involve sharing user details with a third party.

However, what this legislation is certainly not going to do is stop these cookies from existing. It seems very difficult to enforce, and many websites are likely to operate an opt-out rather than opt-in consent model, no matter what the directive says. Instead, I suspect it’s going to force conscientious (aka public sector) websites to seek explicit user consent for perfectly reasonable uses of cookies. This well-meaning (but probably futile) legislation therefore raises a practical question: how does one ask a user for permission to store cookies?

One approach which I’m prepared to bet won’t work is that taken by the UK Information Commissioner’s Office. Here’s what they display to users at the top of each page:

The Information Commissioner's Office cookie consent form

In text:

“On 26 May 2011, the rules about cookies on websites changed. This site uses cookies. One of the cookies we use is essential for parts of the site to operate and has already been set. You may delete and block all cookies from this site, but parts of the site will not work. To find out more about cookies on this website and how to delete cookies, see our privacy notice.”

Before going further, I think it’s important to say that this is not really a criticism of the ICO website. Indeed, this is a logical approach to take when seeking user consent: the reason for the box is shown, and the notice is fairly clear and concise. However, I have the following problems with it, to name just a few:

  • Cookies are not well understood by users – probably not even by the target audience of the ICO website. Can they provide informed consent without understanding what a cookie is?
  • Why does this site use cookies? All this box says is that “parts of the site will not work” if cookies are blocked. Given that warning, is any user likely to want to block them? If not, why bother with the warning at all?
  • The site operates both an opt-in and an opt-out policy. I find this surprising and a little bit confusing. If it was considered reasonable not to warn users about the first cookie, why are the others different?
  • To really understand the question, I am expected to read the full privacy policy. As privacy policies go, this is a fairly good one, but I’m still not going to read all 1,900 words of it. I’m at the website for other reasons (to read about Privacy Impact Assessments, as it happens).
If this is the best that the Information Commissioner’s Office can do, what chance do the rest of us have? More to the point, how does anyone obtain informed user consent for cookies without falling into the same traps? Without a viable solution, I fear this EU legislation will have no impact whatsoever on those websites which do violate user privacy expectations and, worse, will punish law-abiding websites with usability problems.
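For what it’s worth, the mechanics of a genuine opt-in are not the hard part; the wording is. A browser-side sketch (all names invented) of what “no storage before consent” looks like:

```typescript
// Browser-side sketch: nothing is stored until the user opts in. The
// consent record itself is kept in a cookie, which is commonly argued
// to fall under the directive's "strictly necessary" exemption.
function hasConsent(): boolean {
  return document.cookie.split("; ").includes("cookieConsent=yes");
}

function recordConsent(): void {
  document.cookie = "cookieConsent=yes; max-age=31536000; path=/";
}

function setAnalyticsCookie(): void {
  if (!hasConsent()) {
    return; // opt-in: no tracking cookie without an explicit "yes"
  }
  document.cookie = "analytics_id=abc123; max-age=31536000; path=/";
}
```

The code is trivial; explaining to a user what they are consenting to, in fewer than 1,900 words, is the part nobody has solved.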

About.me: how far we’ve come

about.me offers everyone a Web 2.0 home page. Besides signalling an interesting trend in narcissism, two things strike me:

  1. Is this the ultimate page for the stalker (or identity thief)? If ‘yes’, then you presumably find value in the security-through-obscurity of having a selection of un-linked social networks. That, in itself, is an interesting discussion to have.
  2. The page links to many leading content providers, without the need to give about.me a single password (at last!). Of course, the click-through for each of the sites entails giving about.me permission to do almost anything with your account … but at least you can review and revoke that later (OAuth is bullet-proof, right?! 🙂 ). Many of them, in turn, are happy to use Google or Facebook as authenticators (I noticed today that you can create a whole Yahoo! account just from a Google cross-authentication). It would be interesting to map what depends on what, these days.

All in all, this seems like progress of some sort.  It’s all starting to work, isn’t it?

Is about.me a good source of authoritative information about the named individual? Hmm. I’m not sure about that: but if ‘identity’ means anything at all, surely it means something about your ongoing and persistent relationships and interactions.

RSA gets a Chief Security Officer

Just an interesting snippet from The Register (emphasis mine):

RSA has appointed its first chief security officer, three months after a data theft on its network contributed to the hack of the world’s biggest defense contractor, and possibly other important customers.

I’ve been telling people for ages that having a CISO is normal good practice these days.  Evidently nobody told the security industry.

Andrew can’t encrypt, either

A seminal paper once explained why Johnny can’t encrypt.  I thought I could, but the combined forces of mail clients and certification authorities seem to be working together to confound me.

Although PGP is the grand-daddy of email security, its core security model – a web of trust, in which each user authenticates their friends’ public keys – was problematic, at least from a scalability perspective. I had a PGP key once, a long time ago, but barely used it – and had barely anyone to talk to. S/MIME offered a more corporate-feeling alternative: an email signing and encryption scheme built into most mail clients and based around X.509 certificates backed by major vendors.
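The scalability complaint is easier to see side by side. Here is a toy contrast of the two trust models; the types and hop limits are my own illustration, not any real PGP or S/MIME implementation:

```typescript
// A public key, and the owners of the keys that vouch for it.
interface Key {
  owner: string;
  signedBy: string[];
}

// PGP-style web of trust: a key is trusted if I signed it myself, or
// (within a few hops) someone whose key I trust signed it. Everyone
// must do their own signing legwork.
function trustedByWeb(key: Key, all: Map<string, Key>, me: string, hops = 2): boolean {
  if (key.signedBy.includes(me)) return true;
  if (hops === 0) return false;
  return key.signedBy.some((owner) => {
    const signer = all.get(owner);
    return signer !== undefined && trustedByWeb(signer, all, me, hops - 1);
  });
}

// S/MIME-style hierarchy: a key is trusted if its certificate chain
// reaches one of a small, fixed set of root authorities. The legwork
// is delegated to the vendors who vet certificate requests.
function trustedByChain(key: Key, all: Map<string, Key>, roots: Set<string>, hops = 5): boolean {
  if (key.signedBy.some((owner) => roots.has(owner))) return true;
  if (hops === 0) return false;
  return key.signedBy.some((owner) => {
    const issuer = all.get(owner);
    return issuer !== undefined && trustedByChain(issuer, all, roots, hops - 1);
  });
}
```

The same question – “should I trust this key?” – but only the second version lets millions of strangers interoperate without ever meeting.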

S/MIME was something I could use. I used to have a Thawte ‘Freemail’ certificate: these were issued for free, on the strength of being able to receive email at a specified address. If you wanted your ‘real name’ in the certificate, Thawte had its own ‘web of trust’ through which your identity could be verified. This kind of worked, and I used the certificates with Thunderbird quite happily. I found that quite a number of people were sending me signed emails, and Thunderbird was able to verify the signatures; and from time to time I even found a correspondent who wanted to exchange encrypted messages. All of my outgoing mail was signed, for a year or two.

Well, almost all of my outgoing mail was signed.  Occasionally I would use a client – such as Gmail – which didn’t facilitate that.  In that time, not one email recipient complained of receiving an unsigned email: if signatures were the norm, you’d have thought that someone would have spotted the anomaly and questioned whether the message truly came from me.

But that arrangement came to an end: Thawte stopped issuing Freemail certificates, and Thunderbird 3.0 was so difficult to use that I abandoned it in favour of the mail client everyone loves (?): Outlook. Outlook’s handling of certificates is a little obscure, but in Office 2007 it was quite functional. When I upgraded to Office 2010, not only did I start receiving increasingly cryptic error messages, but several recipients told me that my signed messages appeared blank to them.

So I stopped signing.  I retained the certificate (and corresponding keys) I had, lest anyone send me an encrypted message.  This, they continue to do occasionally, but now I get further cryptic error messages and no sign of a decrypted email.

Where do the certificates come from?  Well, Comodo continue to offer a free email client certificate.  I have one, so I must have managed to persuade their software to issue one, once upon a time.  But today I am totally failing to manage that: the certificate is issued, but I get a browser error when I try to download and install it.  This is before I attempt the awkward feat of trying to transfer it from browser to mail client (if that step is still needed).  Even trying to retrieve an old certificate runs me up against requests for long-forgotten passwords.

This is a long tale of woe, and I have omitted many of the gory details.  The upshot is that I, who understand the workings of email really quite well, and the principles of cryptography and X.509 certificates, and the broad design of my web browser and email client in this area, am neither able to sign nor encrypt (nor decrypt) email today. I’m using mainstream software and the apparent best efforts of major vendors, but the outcome is quite unusable to me.

I’ve invested quite a few hours in trying to make this work.  I have a hunch that with a few more hours’ effort I might get somewhere – but my confidence in that is ebbing away.  Andrew can’t encrypt.  🙁