I just landed on a flight from Toronto to San Francisco. If you were inside the USA you may not have heard about the various crazy rules applied to travel to the USA, or at least not experienced them. While we were away the rules changed every day, and perhaps every hour.
There is some controversy, including a critique from our team at the EFF of Facebook's new privacy structure, and their new default and suggested policies that push people to expose more of their profile and data to "everyone."
I understand why Facebook finds this attractive. "Everyone" means search engines like Google, and also total 3rd party apps like those that sprang up around Twitter.
There are a variety of tools that offer encrypted filesystems for the various OSs. None of them are as easy to use as we would like, and none have reached the goal of "Zero User Interface" (ZUI), which is the only thing that has led to successful deployment of encryption (e.g. Skype, SSH and SSL).
Many of these tools have a risk of failure if you don't also encrypt your swap/paging space, because your swap file will contain fragments of memory, including the decrypted contents of encrypted files and even in some cases decryption keys. There is a lot of other confidential data which can end up in swap -- web banking passwords and just about anything else.
It's not too hard to encrypt your swap on Linux, and the ecryptfs tools package includes a tool to set up encrypted swap. (The encryption is done not with ecryptfs but with dm-crypt, the block-device encryptor; the tool sets that up for you.)
However, I would propose that swap be encrypted by default, even if the user does nothing. When you boot, the system would generate a random key for that session, and use it to encrypt all writes and reads to the swap space. That key of course would never be swapped out, and furthermore, the kernel could even try to move it around in memory to avoid the attacks the EFF recently demonstrated where the RAM of a computer that's been turned off for a short time is still frequently readable. (In the future, computers will probably come with special small blocks of RAM in which to store keys which are guaranteed -- as much as that's possible -- to be wiped in a power failure, and also hard to access.)
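The per-session key idea can be illustrated with a toy model. This is only a sketch under loud assumptions: the `EphemeralSwap` class and its method names are hypothetical, a dictionary stands in for the swap device, and the SHAKE-256 keystream is a stand-in for a real block-device cipher of the kind dm-crypt provides. It shows only the shape of the idea: a random key made at boot, held in memory, and never written alongside the data.

```python
import hashlib
import secrets


class EphemeralSwap:
    """Toy model of swap encrypted under a key that exists only for one boot."""

    def __init__(self):
        # Fresh random key per "boot"; it is never written to the backing store.
        self._key = secrets.token_bytes(32)
        # page number -> ciphertext, standing in for the swap partition
        self._disk = {}

    def _keystream(self, page_no, length):
        # Illustrative keystream only. Rewriting a page reuses the same
        # keystream, which a real cipher must avoid.
        seed = self._key + page_no.to_bytes(8, "big")
        return hashlib.shake_256(seed).digest(length)

    def _xor(self, page_no, data):
        stream = self._keystream(page_no, len(data))
        return bytes(a ^ b for a, b in zip(data, stream))

    def swap_out(self, page_no, plaintext):
        # Every write to swap is encrypted under the session key.
        self._disk[page_no] = self._xor(page_no, plaintext)

    def swap_in(self, page_no):
        # Every read from swap is decrypted with the same key.
        return self._xor(page_no, self._disk[page_no])

    def raw_disk(self, page_no):
        # What someone reading the swap partition directly would see.
        return self._disk[page_no]
```

A second `EphemeralSwap()` instance, standing in for the next boot, gets a different key, so pages left behind by the previous session are unreadable to it.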
The automatic encryption of swap does bring up a couple of issues. First of all, it's not secure with hibernation, where your computer is suspended to disk. Indeed, to make hibernation work, you would have to save the key at the start of the hibernation file. Hibernation would thus eliminate all security on the data -- but this is no worse than the situation today, where all swap is insecure. And many people never hibernate.
(Update: I had a formatting error in the original posting, this has been fixed.)
A few weeks ago when I wrote about the non-deployment of SSL I touched on an old idea I had to make web transactions vastly more efficient. I recently read about Google's proposed SPDY protocol, which goes in a completely opposite direction, attempting to solve the problem of large numbers of parallel requests to a web server by multiplexing them all in a single streaming protocol that works inside a TCP session.
While calling attention to that, let me outline what I think would be the fastest way to do very simple web transactions. It may be that such simple transactions are no longer common, but it's worth considering.
Today the way this works is pretty complex:
- You do a DNS request for www.example.com via a UDP request to your DNS server. In the pure case this also means first asking where ".com" is, but your DNS server almost surely knows that already, so instead a UDP request is sent straight to the ".com" master server.
- The ".com" master server returns with the address of the server for example.com.
- You send a DNS request to the example.com server, asking where "www.example.com" is.
- The example.com DNS server sends a UDP response back with the IP address of www.example.com
- You open a TCP session to that address. First, you send a "SYN" packet.
- The site responds with a SYN/ACK packet.
- You respond to the SYN/ACK with an ACK packet. You also send the packet with your HTTP "GET" request for "/page.html." This is a distinct packet but there is no roundtrip so this can be viewed as one step. You may also close off your sending with a FIN packet.
- The site sends back data with the contents of the page. If the page is short it may come in one packet. If it is long, there may be several packets.
- There will also be acknowledgement packets as the multiple data packets arrive in each direction. You will send at least one ACK. The other server will ACK your FIN.
- The remote server will close the session with a FIN packet.
- You will ACK the FIN packet.
You may not be familiar with all this, but the main thing to understand is that there are a lot of roundtrips going on. If the servers are far away and the time to transmit is long, it can take a long time for all these round trips.
It gets worse when you want to set up a secure, encrypted connection using TLS/SSL. On top of all the TCP, there are additional handshakes for the encryption. For full security, you must encrypt before you send the GET because the contents of the URL name should be kept encrypted.
A simple alternative
Consider a protocol for simple transactions where the DNS server plays a role, and short transactions use UDP. I am going to call this the "Web Transaction Protocol" or WTP. (There is a WAP variant called that but WAP is fading.)
- You send, via a UDP packet, not just a DNS request but your full GET request to the DNS server you know about, either for .com or for example.com. You also include an IP and port to which responses to the request can be sent.
- The DNS server, which knows where the target machine is (or next level DNS server) forwards the full GET request for you to that server. It also sends back the normal DNS answer to you via UDP, including a flag to say it forwarded the request for you (or that it refused to, which is the default for servers that don't even know about this.) It is important to note that quite commonly, the DNS server for example.com and the www.example.com web server will be on the same LAN, or even be the same machine, so there is no hop time involved.
- The web server, receiving your request, considers the size and complexity of the response. If the response is short and simple, it sends it to your specified address, ideally in one UDP packet, though possibly more than one. If it receives no ACK in reasonable time, it sends the response again a few times until it gets one.
- When you receive the response, you send an ACK back via UDP. You're done.
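The steps above can be sketched as three small functions. Everything here is an assumption made for illustration: the post defines no wire format, so the JSON encoding, the function names, the `MAX_DATAGRAM` limit and the `use_tcp` fallback for large pages are all hypothetical, and the addresses come from the ranges reserved for documentation. Retransmission and the final ACK are left out.

```python
import json

MAX_DATAGRAM = 1200  # assumed payload limit for a single UDP packet; not from the post


def build_request(host, path, reply_ip, reply_port):
    """Pack the DNS lookup and the GET into one hypothetical WTP datagram."""
    message = {
        "lookup": host,                      # name the DNS server resolves anyway
        "get": path,                         # request to forward to the web server
        "reply_to": [reply_ip, reply_port],  # where the response should be sent
    }
    return json.dumps(message).encode("utf-8")


def dns_server_handle(datagram, zone, supports_wtp=True):
    """Return (dns_answer, datagram_to_forward_or_None)."""
    request = json.loads(datagram.decode("utf-8"))
    address = zone[request["lookup"]]
    # Servers that do not know about WTP simply do not forward.
    forwarded = datagram if supports_wtp else None
    answer = {"address": address, "forwarded": forwarded is not None}
    return answer, forwarded


def web_server_handle(datagram, pages):
    """Answer in one datagram if the page is small; otherwise decline (assumed fallback)."""
    request = json.loads(datagram.decode("utf-8"))
    body = pages[request["get"]]
    if len(body.encode("utf-8")) <= MAX_DATAGRAM:
        reply = {"status": "ok", "body": body}
    else:
        reply = {"status": "use_tcp"}
    return tuple(request["reply_to"]), json.dumps(reply).encode("utf-8")


zone = {"www.example.com": "192.0.2.10"}
pages = {"/page.html": "<p>hello</p>"}
request = build_request("www.example.com", "/page.html", "198.51.100.7", 40000)
answer, forwarded = dns_server_handle(request, zone)
destination, reply = web_server_handle(forwarded, pages)
```

The client still receives the ordinary DNS answer, with a flag saying whether the request was forwarded, so it can fall back to the conventional path when the flag is false.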
The above transaction would take place incredibly fast compared to the standard approach. If you know the DNS server for example.com, it will usually mean a single packet to that server, and a single packet coming back -- one round trip -- to get your answer. If you only know the server for .com, it would mean a single packet to the .com server which is forwarded to the example.com server for you. Since the master servers tend to be in the "center" of the network and are multiplied out so there is one near you, this is not much more than a single round trip.
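A rough tally makes the difference concrete. This sketch counts round trips from the steps listed earlier, and assumes, purely for illustration, that every round trip costs the same 100 ms and that server processing time is zero. The names and the latency figure are not from any measurement.

```python
def time_to_first_data(round_trips, rtt_ms):
    """Elapsed time before the first response data arrives, ignoring server work."""
    return round_trips * rtt_ms


# Round trips in the conventional sequence, as listed in the steps above.
CONVENTIONAL = {
    "ask the .com server where example.com is": 1,
    "ask the example.com server where www is": 1,
    "TCP SYN / SYN-ACK": 1,
    "ACK + GET, then first data packet": 1,
}

# WTP when you already know the example.com DNS server.
WTP_KNOWN_DNS = {"request forwarded by the DNS server, response comes back": 1}

RTT_MS = 100  # assumed latency to a distant server

conventional_ms = time_to_first_data(sum(CONVENTIONAL.values()), RTT_MS)
wtp_ms = time_to_first_data(sum(WTP_KNOWN_DNS.values()), RTT_MS)
print(conventional_ms, wtp_ms)  # prints "400 100"
```

The tally stops at the first data packet; the closing FIN and ACK exchanges of the TCP session add more packets but do not delay the page.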
I just returned from Jeff Pulver's "140 Characters" conference in L.A. which was about Twitter. I asked many people if they get Twitter -- not if they understand how it's useful, but why it is such a hot item, and whether it deserves to be, with billion dollar valuations and many talking about it as the most important platform.
Some suggested Twitter is not as big as it appears, with a larger churn than expected and some plateau appearing in new users. Others think it is still shooting for the moon.
I have written before about how overzealous design of cryptographic protocols often results in their non-use. Protocol engineers are trained to be thorough and complete. They rankle at leaving in vulnerabilities, even against the most extreme threats. But the perfect is often the enemy of the good. None of the various protocols to encrypt E-mail have ever reached even a modicum of success in the public space. It's a very rare VoIP call (other than Skype) that is encrypted.
The two most successful encryption protocols in the public space are SSL/TLS (which provide the HTTPS system among other things) and Skype. At a level below that are some of the VPN applications and SSH.
TLS (the successor to SSL) is very widely deployed but still very rarely used. Only a tiny fraction of web sessions are encrypted. Many sites don't support it at all. Some will accept HTTPS but immediately push you back to HTTP. In most cases, sites will have you log in via HTTPS so your password is secure, and then send you back to unencrypted HTTP, where anybody on the wireless network can watch all your traffic. It's a rare site that lets you conduct your entire series of web interactions entirely encrypted. This site fails in that regard. More common is the use of TLS for POP3 and IMAP sessions, because it's easy: there is only one TCP session, and the users who access the server are a small and controlled set. The same is true with VPNs -- one session, and typically the users are all required by their employer to use the VPN, so it gets deployed. IPSec code exists in many systems, but is rarely used in stranger-to-stranger communications (or even friend-to-friend) due to the nightmares of key management.
TLS's complexity makes sense for "sessions" but has problems when you use it for transactions, such as web hits. Transactions want to be short. They consist of a request, and a response, and perhaps an ACK. Adding extra back and forths to negotiate encryption can double or triple the network cost of the transactions.
Skype became a huge success at encrypting because it is done with ZUI -- the user is not even aware of the crypto. It just happens. SSH takes an approach that is deliberately vulnerable to man-in-the-middle attacks on the first session in order to reduce the UI, and it has almost completely replaced unencrypted telnet among the command line crowd.
I write about this because now Google is finally doing an experiment to let people have their whole gmail session be encrypted with HTTPS. This is great news. But hidden in the great news is the fact that Google is evaluating the "cost" of doing this. There also may be some backlash if Google does this on web search, as it means that ordinary sites will stop getting to see the search query in the "Referer" field until they too switch to HTTPS and Google sends traffic to them over HTTPS. (That's because, for security reasons, the HTTPS design says that if I made a query encrypted, I don't want that query to be repeated in the clear when I follow a link to a non-encrypted site.) Many sites do a lot of log analysis to see what search terms are bringing in traffic, and may object when that goes away.
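The Referer behavior described in the parenthesis can be written as a small rule. This is a simplified sketch: `referer_for` is a hypothetical helper, the URLs are made up, and it models only the classic rule of dropping the header when going from a secure page to a non-secure one, not the finer-grained policies a browser may apply.

```python
from urllib.parse import urlsplit


def referer_for(from_url, to_url):
    """Return the Referer value to send with the request, or None to omit it."""
    source = urlsplit(from_url)
    target = urlsplit(to_url)
    # A query made over HTTPS must not be repeated in the clear.
    if source.scheme == "https" and target.scheme != "https":
        return None
    return from_url


search = "https://search.example/?q=encrypted+swap"
print(referer_for(search, "http://blog.example/post"))   # prints "None"
print(referer_for(search, "https://blog.example/post"))  # prints the search URL
```

Under this rule, a site that wants to keep seeing search terms from an encrypted search page has to be reachable over HTTPS itself.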
Yesterday it was announced that "Clear" (Verified ID Pass) the special "bypass the line at security" card company, has shut its doors and its lines. They ran out of money and could not pay their debts. No surprise there, they were paying $300K/year rent for their space at SJC and only 11,000 members used that line.
As I explained earlier, something was fishy about the program. It required a detailed background check, with fingerprint and iris scan, but all it did was jump you to the front of the line -- which you get for flying in first class at many airports without any background check. Their plan, as I outline below, was to also let you use a fancy shoe and coat scanning machine from GE, so you would not have to take them off. However, the TSA was only going to allow those machines once it was verified they were just as secure as existing methods -- so again no need for the background check.
To learn more about the company, I attended a briefing they held a year ago for a contest they were holding: $500,000 to anybody who could come up with a system that sped up their lines at a low enough cost. I did have a system, but also wanted to learn more about how it all worked. I feel sorry for those who worked hard on the contest who presumably will not be paid.
The background check
The usual approach to authentication online is the "login" approach -- you enter userid and password, and for some "session" your actions are authenticated. (Sometimes special actions require re-authentication, which is something my bank does on things like cash transfers.) This is so widespread that all browsers will now remember all your passwords for you, and systems like OpenID have arisen to provide "universal sign on," though to only modest acceptance.
Another approach which security people have been trying to push for some time is authentication via digital signature and certificate. Your browser is able, at any time, to prove who you are, either for special events (including logins) or all the time. In theory these tools are present in browsers but they are barely used. Login has been popular because it always works, even if it has a lot of problems with how it's been implemented. In addition, for privacy reasons, it is important your browser not identify you all the time by default. You must decide you want to be identified to any given web site.
I wrote earlier about the desire for more casual authentication for things like casual comments on message boards, where creating an account is a burden and even use of a universal login can be a burden.
I believe an answer to some of the problems can come from developing a system of authenticated actions rather than always authenticating sessions. Creating a session (i.e. login) can be just one of a range of authenticated actions, or AuthAct.
To do this, we would adapt HTML actions (such as submit buttons on forms) so that they could say, "This action requires the following authentication." This would tell the browser that if the user is going to click on the button, their action will be authenticated and probably provide some identity information. In turn, the button would be modified by the browser to make it clear that the action is authenticated.
An example might clarify things. Say you have a blog post like this with a comment form. Right now the button below you says "Post Comment." On many pages, you could not post a comment without logging in first, or, as on this site, you may have to fill other fields in to post the comment.
In this system, the web form would indicate that posting a comment is something that requires some level of authentication or identity. This might be an account on the site. It might be an account in a universal account system (like a single sign-on system). It might just be a request for identity.
Your browser would understand that, and change the button to say, "Post Comment (as BradT)." The button would be specially highlighted to show the action will be authenticated. There might be a selection box in the button, so you can pick different actions, such as posting with different identities or different styles of identification. Thus it might offer choices like "as BradT" or "anonymously" or "with pseudonym XXX" where that might be a unique pseudonym for the site in question.
Now you could think of this as meaning "Login as BradT, and then post the comment" but in fact it would be all one action, one press. In this case, if BradT is an account in a universal sign-on system, the site in question may never have seen that identity before, and won't, until you push the submit button. While the site could remember you with a cookie (unless you block that) or based on your IP for the next short while (which you can't block) the reality is there is no need for it to do that. All your actions on the site can be statelessly authenticated, with no change in your actions, but a bit of a change in what is displayed. Your browser could enforce this, by converting all cookies to session cookies if AuthAct is in use.
Note that the first time you use this method on a site, the box would say "Choose identity" and it would be necessary for you to click and get a menu of identities, even if you only have one. This is because there are always tools that try to fake you out and make you press buttons without you knowing it, by taking control of the mouse or covering the buttons with graphics that skip out of the way -- there are many tricks. The first handover of identity requires explicit action. It is almost as big an event as creating an account, though not quite that significant.
You could also view the action as, "Use the account BradT, creating it if necessary, and under that name post the comment." So a single posting would establish your ID and use it, as though the site doesn't require userids at all.
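As a rough illustration of the browser side, here is how the button label and its menu might be chosen. The function names, arguments and the exact label strings are hypothetical; no browser implements AuthAct, and this sketch covers only the display logic, not the signing of the action itself.

```python
def button_label(action, site, identities, used_before):
    """Label a browser might show on a button whose action requires authentication.

    identities:  names the user could present, e.g. ["BradT"]
    used_before: maps site -> identity the user explicitly chose there earlier
    """
    if not identities:
        return action
    previous = used_before.get(site)
    if previous is None:
        # The first handover of identity to a site must be an explicit choice,
        # even when the user has only one identity.
        return f"{action} (Choose identity)"
    return f"{action} (as {previous})"


def menu_choices(identities, site):
    """Entries offered in the selection box inside the button."""
    menu = [f"as {name}" for name in identities]
    menu.append("anonymously")
    menu.append(f"with pseudonym for {site}")
    return menu


print(button_label("Post Comment", "blog.example", ["BradT"], {}))
# prints "Post Comment (Choose identity)"
print(button_label("Post Comment", "blog.example", ["BradT"], {"blog.example": "BradT"}))
# prints "Post Comment (as BradT)"
```

Keeping the earlier choice in the browser rather than in a site cookie is what lets each action be authenticated statelessly.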
I recently attended the eComm conference on new telephony. Two notes in presentations caught my attention, though they were mostly side notes. In one case, the presenter talked about the benefits of having RFID tags in everything.
"Your refrigerator," he said, "could read the RFID and know if your milk was expired." In the old days we just looked at the date or smelled it.
I've written about "data hosting/data deposit box" as an alternative to "cloud computing." Cloud computing is timesharing -- we run our software and hold our data on remote computers, and connect to them from terminals. It's a swing back from personal computing, where you had your own computer, and it erases the 4th amendment by putting our data in the hands of others.
It's been a remarkably dramatic year at the EFF. We worked in a huge number of areas, acting on or participating in a lot of cases. The most famous is our ongoing battle over the warrantless wiretapping scandal, where we sued AT&T for helping the White House. As you probably know, we certainly got their attention, to the point that President Bush got the congress to pass a law granting immunity to the phone companies. We lost that battle, but our case still continues, as we're pushing to get that immunity declared unconstitutional.
Ford is making a new car-limiting system called MyKey standard in future models. This allows the car owner to enable various limits and permissions on the keys they give to their teen-agers. Limits included in the current system include an 80 mph speed limit, a 40% volume limit on the stereo, never-ending seatbelt reminders, earlier low-fuel warnings, audio speed alerts and inability to disable various safety systems.
Most of us have had to stand in a long will-call line to pick up tickets. We probably even paid a ticket "service fee" for the privilege. Some places are helping by having online printable tickets with a bar code. However, that requires that they have networked bar code readers at the gate which can detect things like duplicate bar codes, and venues seem to prefer giant lines and many staff to buying such machines.
Can we do it better?
There's a bit of an internet buzz this week around a video of a law lecture on why you should never, ever, ever, ever talk to the police. The video begins with the law professor and criminal defense attorney, who is a good speaker, making that case, and then a police detective, interesting but not quite as eloquent, agreeing with him and describing the various tricks the police use every day with people stupid enough to talk to them.
There are a variety of tools out there to help recover stolen technological devices. They make the devices "phone home" to the security company, and if stolen, this can be used to find the laptop (based on IP traceroutes etc.) and get it back. Some of these tools work hard to hide on the machine, even claiming they will survive low level disk formats. Some reportedly get installed into the BIOS to survive a disk swap.
Sadly, I must report that after our initial success in getting the members of the House to not grant immunity to telcos who participated in the illegal warrantless wiretap program which we at the EFF are suing over, the attempt to join the Senate bill (which grants immunity) to the House bill has, by reports, resulted in a so-called compromise that effectively grants the immunity.
Recently we at the EFF have been trying to fight new rulings about the power of U.S. customs. Right now, it's been ruled they can search your laptop, taking a complete copy of your drive, even if they don't have the normally required reasons to suspect you of a crime. The simple fact that you're crossing the border gives them extraordinary power.
We would like to see that changed, but until then what can be done? You can use various software to encrypt your hard drive -- there are free packages like TrueCrypt, and many laptops come with this as an option -- but most people find having to enter a password every time they boot to be a pain. And customs can threaten to detain you until you give them the password.
There are some tricks you can pull, like having a special inner-drive with a second password that they don't even know to ask about. You can put your most private data there. But again, people don't use systems with complex UIs unless they feel really motivated.
What we need is a system that is effectively transparent most of the time. However, you could take special actions when going through customs or otherwise having your laptop be out of your control.
I've been ranting of late about the dangers inherent in "Data Portability" which I would like to rename as BEPSI to avoid the motherhood word "portability" for something that really has a strong dark side as well as its light side.
But it's also important to come up with an alternative. I think the best alternative may lie in what I would call a "data deposit box" (formerly "data hosting.") It's a layered system, with a data layer and an application layer on top. Instead of copying the data to the applications, bring the applications to the data.
A data deposit box approach has your personal data stored on a server chosen by you. That server's duty is not to exploit your data, but rather to protect it. That's what you're paying for. Legally, you "own" it, either directly, or in the same sense as you have legal rights when renting an apartment -- or a safety deposit box.
Your data box's job is to perform actions on your data. Rather than giving copies of your data out to a thousand companies (the Facebook and Data Portability approach) you host the data and perform actions on it, programmed by those companies who are developing useful social applications.
As such, you don't join a site like Facebook or LinkedIn. Rather, companies like those build applications and application containers which can run on your data. They don't get the data, rather they write code that works with the data and runs in a protected sandbox on your data host -- and then displays the results directly to you.
To take a simple example, imagine a social application wishes to send a message to all your friends who live within 100 miles of you. Using permission tokens provided by you, it is able to connect to your data host and ask it to create that subset of your friend network, and then e-mail a message to that subset. It never sees the friend network at all.
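That example might look something like the sketch below. All names here (`DataHost`, `message_nearby_friends`, the token string, the sample friends) are hypothetical, the sandbox is reduced to a single method call, and an `outbox` list stands in for actually sending e-mail. The point is only that the filtering runs on the data host and nothing about the friend network is returned to the application.

```python
import math


def miles_between(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))


class DataHost:
    """Holds the owner's data and runs application requests against it."""

    def __init__(self, owner_location, friends, valid_tokens):
        self._owner_location = owner_location
        self._friends = friends          # name -> {"email": ..., "location": (lat, lon)}
        self._valid_tokens = set(valid_tokens)
        self.outbox = []                 # stands in for actually sending mail

    def message_nearby_friends(self, token, radius_miles, text):
        """Build the subset and send the message here; return nothing to the caller."""
        if token not in self._valid_tokens:
            raise PermissionError("token not granted by the data owner")
        for friend in self._friends.values():
            distance = miles_between(self._owner_location, friend["location"])
            if distance <= radius_miles:
                self.outbox.append((friend["email"], text))


host = DataHost(
    owner_location=(37.77, -122.42),
    friends={
        "Ann": {"email": "ann@example.org", "location": (37.80, -122.27)},
        "Bob": {"email": "bob@example.org", "location": (43.65, -79.38)},
    },
    valid_tokens={"token-from-owner"},
)
host.message_nearby_friends("token-from-owner", 100, "Party on Saturday")
```

The application supplies the token, the radius and the message text, and gets nothing back.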