Most of the buzz around the latest dispatch from WikiLeaks is about the content, and I have to agree with most of the people who have commented on it: the response amounts to “Yawn, really, that’s what all the fuss is about?”
The process of the leak itself is more interesting. This was a mass download of a bunch of data that various US government agencies were intentionally sharing with one another. Sharing is good, especially for low-risk data such as this. On the other hand, the US government didn’t actually want the data to leak beyond its own people, and given the thousands of people with access, that’s a tall order.
So how do you share something with thousands of people while still minimizing the chances that one of them will release it?
Well, first, you should change the access method to be “one document at a time” rather than “all at once.” I have to assume they actually did that, but someone scripted a bulk download of these documents anyway.
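One-at-a-time access only helps if a script can't simply loop over it, so some per-user throttling seems like the natural companion. Here is a rough sketch of what I have in mind, not a description of anything the government actually runs. The class name `DownloadLimiter` and the limits (20 documents per hour) are made up for illustration.

```python
import time
from collections import defaultdict, deque


class DownloadLimiter:
    """Hypothetical per-user limiter: at most max_docs downloads per window_s seconds."""

    def __init__(self, max_docs=20, window_s=3600.0, clock=time.monotonic):
        self.max_docs = max_docs
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._recent = defaultdict(deque)  # user -> timestamps of recent downloads

    def allow(self, user):
        """Return True and record the download if the user is under the limit."""
        now = self.clock()
        recent = self._recent[user]
        # Drop timestamps that have aged out of the sliding window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        if len(recent) >= self.max_docs:
            return False
        recent.append(now)
        return True
```

A scripted bulk download would hit the limit within minutes, while someone reading a handful of documents a day would never notice it.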
The second step is to impose some sort of economic cost on anyone considering a breach of protocol by releasing the content. This is where some people jump up and yell “Digital Rights Management!” and where I claim “No! DRM Sucks!” 😉 Actually, I think a much more benign solution is to apply a hard-to-detect, hard-to-remove watermark to individual documents downloaded from this sort of database. Basically, if I download a file from this database, the file should be marked up in some way to indicate that it was me who downloaded it. Anyone can read it – but at least people in authority should be able to figure out that it is my download they are reading.
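To make the idea concrete, here is a toy watermark for plain text. It hides the downloader's ID as a run of zero-width Unicode characters after the first word. This is only a sketch under my own assumptions: the function names are invented, the original text is assumed to contain no zero-width characters of its own, and this particular trick is hard to see but easy to strip, so a real system would need something far more robust.

```python
ZERO = "\u200b"  # zero-width space, stands for bit 0
ONE = "\u200c"   # zero-width non-joiner, stands for bit 1


def embed_watermark(text, user_id):
    """Hide user_id as invisible characters right after the first word of text."""
    bits = "".join(f"{byte:08b}" for byte in user_id.encode("utf-8"))
    mark = "".join(ONE if bit == "1" else ZERO for bit in bits)
    head, sep, tail = text.partition(" ")
    return head + mark + sep + tail


def extract_watermark(text):
    """Recover the hidden ID from a marked copy (assumes the mark is intact)."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore any trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")
```

The marked copy looks identical on screen, but whoever runs the database can call `extract_watermark` on a leaked copy and see whose download it was.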
Same thing with the WikiLeaks documents – if the feds had used a file format that allows for watermarking and had marked up downloaded documents, then legitimate users, including whoever actually leaked the content, wouldn’t have been so eager to let the cat out of the bag.
Technologically, you need some sort of watermarking system and, of course, an identity and access system: users have to identify themselves and authenticate before they can download this stuff, or else the central server wouldn’t know what to put in the watermark.
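Once the server knows who is asking, it also needs a watermark payload that nobody can forge to frame a colleague. A minimal sketch, assuming the server holds a secret key and signs the payload with an HMAC; the key, the field layout, and the function names are all hypothetical, and the IDs are assumed not to contain the `|` separator.

```python
import hashlib
import hmac

SERVER_KEY = b"demo-key-not-for-real-use"  # hypothetical secret held only by the server


def _tag(body):
    """Truncated HMAC-SHA256 over the payload body, as a hex string."""
    return hmac.new(SERVER_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def make_payload(user_id, doc_id, issued_at):
    """Build the string to embed in the watermark for an authenticated user."""
    body = f"{user_id}|{doc_id}|{issued_at}"
    return f"{body}|{_tag(body)}"


def verify_payload(payload):
    """Return the payload fields if the tag checks out, otherwise None."""
    body, _, tag = payload.rpartition("|")
    if not hmac.compare_digest(tag, _tag(body)):
        return None
    user_id, doc_id, issued_at = body.split("|")
    return {"user": user_id, "doc": doc_id, "issued_at": int(issued_at)}
```

The signed string is what would go into the watermark itself, so an investigator can trust the name that comes back out.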
In fact, this raises another question – don’t they log who downloads content? If they don’t, then they deserve the outcome they got. If they do log, then they should already know who downloaded all this content.
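If such a log exists, spotting a bulk download is not hard. Here is a quick sketch, assuming each log entry is a simple `(user, doc_id)` pair; that format, the function name, and the threshold of 100 are my own inventions.

```python
from collections import defaultdict


def flag_bulk_downloaders(log_entries, threshold=100):
    """Return {user: distinct_doc_count} for users above the threshold."""
    docs_by_user = defaultdict(set)
    for user, doc_id in log_entries:
        docs_by_user[user].add(doc_id)  # repeat downloads of one document count once
    return {
        user: len(docs)
        for user, docs in docs_by_user.items()
        if len(docs) > threshold
    }
```

Anyone who pulled down a large share of the database would stand out immediately in a report like this.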
That’s my $0.02 for today.