So, are you on Facebook?

This post is entirely inspired by David Ascher’s talk on “Messaging with Mozilla Values” at the Mozilla summit, and is also a result of me deciding not to simply let blog posts sit as drafts forever (looks like I haven’t made a post in a while!)

It’s been a couple of months since I deleted my facebook account. It wasn’t an impulse decision, and I had been mulling over it for a while before actually going through with it. I don’t really miss it, which I suppose, is a good thing. As a technology enthusiast however, I do want to keep up with what the biggest social network in the world is up to. Create an alias you say? While signing up for one, I noticed that the form very strongly notes that one is to use their “real name only”. Interesting. I’m just going to wait and see if they decide that I am a fake person.

All that is well and good, but my conversations with new and interesting young people I meet almost always ends with “are you on facebook?”. My response is usually met with either mild surprise or a sliver of disappointment. I quickly explain that it doesn’t mean they can’t get in touch with me but that leads to a look that I’ve come to interpret as “yeah right”, also known as “ugh, email”. Which explains why my personal inbox only contains messages from my mom, the British lottery council and my very wealthy friends in Nigeria.

This leads us to the burning question of why “standard” messaging protocols like SMTP have failed (or rather failed to evolve) to capture the interest of this generation. As Chris Beard very succinctly put, if you had said we’d be using a single Internet based service to communicate with each other in the 90s we’d have thought you were crazy, simply because we were just finished with the nightmare that was AOL. Yet, only a couple of decades later it seems we’re at a full circle.

Hypothetically speaking, if Google had decided that Gmail could only be used to send messages to other Gmail users, would it have gained as much traction as it did? Yet, millions of Facebook users seem to miss the absurdity in the fact that they can’t use the service to talk to anybody who is not a member of the network. If users fail to recognize this simple drawback, it must mean that playing the ‘privacy’ trumpet or the ‘centralized control’ horn is just wasted effort.

What can we as computer scientists do about the situation (or does the situation even need our attention)? David very rightly points out that the last thing on our minds should be to ask users to stop doing things they absolutely love. I enjoyed facebook. A lot. There is a reason (actually several) for why the service is so popular. It certainly seems to me that understanding the basics of why such a service is a grand success is an interesting exercise in itself.

So, are you on facebook? What do you love about it? What do you think could be better? Do you see initiatives like diaspora succeeding?

Figuring out the Goo.gl API

UPDATE: ‘Fatalis’ has pointed out in the comments that the POST should be made to http://goo.gl/api/url with User-agent set to ‘toolbar’. The code now works, Yay!

Google just announced their own URL shortening service. Their service can only be used from the toolbar or FeedBurner, and I don’t particularly like adding extra toolbars to my browser. Maybe I can figure out a way to use their service from the command line?

I downloaded the toolbar XPI, unzipped it and peeked inside. Horribly indented JS awaited me. Nothing jsbeautifier couldn’t fix though. Few minutes later, I arrived at this readable JS function:

var getUrlShorteningRequestParams = function (b) {
    function c() {
        for (var l = 0, m = 0; m  0 ? l : l + 4294967296);
        for (var o = 0, n = false, p = m.length - 1; p >= 0; --p) {
            var q = Number(m.charAt(p));
            if (n) {
                q *= 2;
                o += Math.floor(q / 10) + q % 10
            } else o += q;
            n = !n
        }
        m = m = o % 10;
        o = 0;
        if (m != 0) {
            o = 10 - m;
            if (l.length % 2 == 1) {
                if (o % 2 == 1) o += 9;
                o /= 2
            }
        }
        m = String(o);
        m += l;
        return l = m
    }
    function e(l) {
        for (var m = 5381, o = 0; o < l.length; o++)
            m = c(m << 5, m, l.charCodeAt(o));
        return m
    }
    function f(l) {
        for (var m = 0, o = 0; o < l.length; o++)
            m = c(l.charCodeAt(o), m << 6, m << 16, -m);
        return m
    }

    var i = e(b);
    i = i >> 2 & 1073741823;
    i = i >> 4 & 67108800 | i & 63;
    i = i >> 4 & 4193280 | i & 1023;
    i = i >> 4 & 245760 | i & 16383;

    var h = f(b);
    var k = (i >> 2 & 15) << 4 | h & 15;
    k |= (i >> 6 & 15) << 12 | (h >> 8 & 15) << 8;
    k |= (i >> 10 & 15) << 20 | (h >> 16 & 15) << 16;
    k |= (i >> 14 & 15) << 28 | (h >> 24 & 15) << 24;
    j = "7" + d(k);

    i = "user=toolbar@google.com&url=";
    i += encodeURIComponent(b);
    i += "&auth_token=";
    i += j;
    return i
};

So, I call getUrlShorteningRequestParams("http://www.kix.in/"); to get "user=toolbar@google.com&url=http%3A%2F%2Fwww.kix.in%2F&auth_token=78925814685". I see in their code that they do a POST request to the service to obtain a JSON return value that would contain the short URL. I punch it in using cURL:

$ curl -v -d\
   "user=toolbar@google.com&url=http%3A%2F%2Fwww.kix.in%2F&;\
   auth_token=78925814685" http://goo.gl/
* About to connect() to goo.gl port 80 (#0)
*   Trying 74.125.19.102... connected
* Connected to goo.gl (74.125.19.102) port 80 (#0)
> POST / HTTP/1.1
> User-Agent: curl/7.19.7 (i386-apple-darwin10.2.0) libcurl/7.19.7
> Host: goo.gl
> Accept: */*
> Content-Length: 77
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 405 HTTP method POST is not supported by this URL

Oops! Well, not really, the URL shortener from the toolbar doesn’t work either, I just get the full URL whenever I try to “share” something. Has anybody actually generated a real goo.gl short URL yet?

Their auth_token parameter seems completely superfluous to me as it is generated from the URL itself. Don’t we all know security by obscurity doesn’t work :)

Go: Why I ♥ Google

Christmas came early this year.

Glenda2Go

Today, Google announced their new open source systems programming language: Go. I’m super excited about this, we all have been wondering what Rob Pike has been upto since he joined the big G, and now we know. Not just that, but Ken Thomson, Robert Griesemer, Ian Taylor and Russ Cox were all involved in the project, with Ken doing what he does best, writing compilers in lightning speed ;) If that isn’t a list of heavyweight respectable computer scientists, I don’t know what is!

I think Go is poised to be the dominant systems programming language of the future. Go has nailed almost every aspect of a systems language, though some would say I’m biased. Go has been strongly influenced by Oberon, CSP languages like Limbo, and the standard libraries have tantalizing similarities to Plan 9. We’ve had Limbo and Plan 9 for a while now (more than a decade), but this is where my real love for Google begins to bubble, they took something awesome but unpopular and gave it a push to the masses. There are very few companies in the world who would attract the talent to do this, and even fewer who would open source the results. The attention Go has been getting is just mind blowing. Pike had been doing amazing work at Bell-Labs for quite a while, but none of it even got an inkling of the publicity Go is currently getting.

Google was what Pike needed to prove Utah2000 wrong.

I know one thing for sure, I’ll definitely be using my Plan 9 virtual machine a lot less; now that I can write clean concurrent programs that don’t make my head hurt, both in Linux and OS X. And GCC, I’m not shedding any tears while I bid you goodbye.

On another note, Google also announced today that they’ll be sponsoring free WiFi at a whole bunch of US airports this holiday season. For all its faults, Google definitely seems to be doing the right thing. For how long, it remains to be seen, but so far I’d say their track record has been better than excellent.

UPDATE: John Gruber points out that “judging from the copyright statements, [Go is] not an official Google project”. Could this be a result of the famous 20% time scheme?

Identity on the web is broken

The mere presence of systems like OpenID, Facebook Connect and a host of other identity services on the web today is attestation to the fact.

Authentication should be a feature of the protocol, not something that relies on hacks like cookies. 99% of websites today rely on cookies for authentication for their websites, besides offering custom registration and login pages. This means the browser, as the user’s agent, has no clue of what is going on. A user is forced to manually track myriads of accounts, remember passwords for each of them, and remember what personal information each of them holds. Sure, part of the problem is solved by using password managers (like the one in-built into Firefox, or external programs like 1Password), but even these programs rely on heuristic algorithms to determine if something looks like a login credential or not. There’s no explicit way for web pages to tell your browser: “This is a login form, please fill in details of the user’s identity here” or “These pages are privileged, please give me the user’s identity”. Why is that?

Actually, there is such a mechanism: HTTP based Authentication has been a feature present since HTTP/1.0, but only 1% of sites actually use it. The reason for that is purely cosmetic, most browsers display a very bland modal dialog when it encounters a page that requires HTTP Auth, and sites are unable to customize that interaction. So, the technically right way to do things sucks from a user experience perspective, and websites started adopting alternate means. Someone discovered they could use cookies to store session information on the client, and the whole situation exploded ever since. As a programmer, I feel very sad when I think about the fact that instead of fixing the problem in HTTP/1.1, web-based authentication took the route it did and led to the mess we are in today.

However, I must also state that HTTP authentication doesn’t solve the entire problem – there is still the issue of users having to create an account for every site they want to be part of. This is because there existed no protocols to federate and provide decentralized authentication. That is, until OpenID and OAuth came about. Now we’re at this exciting juncture, and the browser is in a unique position to use these tools together to provide the user with an experience that is secure and easy to use. Every architect will agree that it is indeed a fun challenge to use the state of identity on the web today and make it into something awesome.

This is precisely what the Mozilla Labs team has been thinking about for a while now. Sometime ago, we added support for automagic one-click OpenID logins to Weave. We plan to spin that “feature” out into it’s own extension and build on it, something we call “Weave Identity“, part of the broader “Open Identity” initiative by the Labs. “Weave Sync“, the original extension, will just focus on the synchronization parts so we can tackle these two different problems separately.

So, how exactly are we planning on doing this? Take a look at an initial version of a document describing an in-browser “Account Manager“. We’ve also put up a WEP (which expands to Weave Enhancement Proposal, by the way) describing the raw form of a specification for automatic actions on websites, such as user registration or password changes.

Keep in mind that all of this is in its very early stages (pre-alpha); but that also means it’s a great opportunity for the community to get involved! What are your thoughts on Open Identity? Use the discussion tab on any of those Wiki pages, start a thread on the Mozilla Labs group, or simply leave a comment on this blog entry, and chip in – we’d love to hear from you!

GSoC Mentor Summit ’09 Roundup

The grand Summer of Code Mentor Summit of 2009 concluded last week and I had the fantastic opportunity of being able to attend on behalf of Gentoo, Plan 9 and Mozilla. What follows is some indication of how awesome the summit was:

(Photo courtesy of warthog from Etherboot)

I met so many folks I’d only interacted with online so far (the classic nickname-to-face matching), but even better was the opportunity to meet folks powering open source projects from so many diverse backgrounds. I met many of my personal rockstars, and learned about a bunch of open source projects I’d never heard of :)

Also, one of the things that is only possible at an event like the summit was the ability to get a whole bunch of non-linux operating system groups in one room. We had a great discussion, and it resulted in the creation of the “rosetta-os” special interest group. Look for more activity on the common device drivers for non-linux operating systems front soon!

Other sessions worthy of special mention were Open Source Security, Recruiting and Retaining Awesome People, Advanced Trolling (yes, you read that right), and of course the always welcoming Casablanca where I spent most of my time. We discussed everything from our SoC experiences to the Afro Celt Sound System in that room, always full of creative energy and warmth.

After 4 years of participating in the Summer of Code, I am super happy to have finally met the faces behind the program. Every single person I met over the course of last weekend was friendly, intelligent and just generally awesome; that sort of thing doesn’t happen by chance. I feel warm and fuzzy inside to think that I’m actually a part of the revolution that is free and open source software, three cheers to everyone that made it possible!

How does Weave use Cryptography?

I’m back from the EU MozCamp in Prague and we all had a great time! Check out the slides from my talks: Labs Overview and Weave in Depth.

A few people at the MozCamp were interested in Weave’s use of cryptography to protect the user’s data and privacy. Although the specs for the Weave server are available, it may take someone new a while to wrap their head around the whole scheme. I’m going to attempt explaining what crypto operations we do and why we do it in this blog post.

First, let’s get some basic definitions out of the way. Symmetric cryptography means you have one key that can perform both encryption and decryption, and they are complementary operations. For Weave, we use AES with a 256 bit key, and we use it in a mode that requires an ‘initialization vector’ for every decryption. Asymmetric cryptography means there’s a pair of keys (usually called ‘public’ and ‘private’ keys). A piece of text “encrypted” by one key can only be “decrypted” by the other key. Here, we use RSA with a 2048 bit private key.

So, when a user first signs up for Weave using the wizard on their computer, we generate a (random) pair of public and private keys. Next, we use the user’s passphrase to create a symmetric key. This is done using a pretty standard algorithm known as PBKDF2 (short for “Password Key Derivation Function”). The PBKDF2 algorithm requires a ‘salt’ value which is also stored on the server. Now that we have a symmetric key, we use it to encrypt the user’s private key and upload it along with the public key to the server. Note that the passphrase is never sent to the server, so if the user’s password ever gets compromised all the attacker can get is their encrypted private key, which really isn’t of much use (especially given that the key is 2048 bits long).

Whenever a particular “engine” is to be synchronized (an engine could be Tabs, Bookmarks, History etc.) we generate a random symmetric key for that engine. This key is then encrypted using the user’s public key (now, one can only retrieve the original symmetric key with the corresponding private key) and uploaded as being associated with a particular engine. All entries (the ‘ciphertext’ property in a “Weave Basic Object”) in that engine are encrypted with the symmetric key that was generated for it.

To make things clear, let’s enumerate the steps we would take to decrypt a single tab object for user ‘foo’:

  1. Find the user’s cluster by making a GET request to https://services.mozilla.com/user/1/foo/node/weave. It returns https://sj-weave06.services.mozilla.com/.
  2. Fetch the user’s encrypted private key and public key from https://sj-weave06.services.mozilla.com/0.5/foo/storage/keys/privkey and https://sj-weave06.services.mozilla.com/0.5/foo/storage/keys/pubkey respectively. The user’s password is required to access these JSON objects.
  3. Ask the user for their passphrase and generate a 256 bit symmetric key from it using PBKDF2 and the ‘salt’ found in the privkey object.
  4. Use the generated symmetric key and the initialization vector found in the ‘iv’ property of the privkey object to decrypt the user’s private key.
  5. Fetch the user’s encrypted tab objects from https://sj-weave06.services.mozilla.com/0.5/foo/storage/tabs/?full=1.
  6. Fetch the corresponding symmetric key (the URL is also listed in the “encryption” property of every WBO), in this case https://sj-weave06.services.mozilla.com/0.5/foo/storage/crypto/tabs.
  7. Decrypt the symmetric key with the user’s private key.
  8. Use the decrypted symmetric key to decrypt any WBO from the tabs collection with the initialization vector found in the ‘bulkIV’ property of the tabs symmetric key WBO.
  9. Profit.

A word about the formats in which the keys are actually stored in. All values are Base64. For symmetric keys, the key is stored as-is. For asymmetric keys, I wish we used a standard format like PKCS#12, but we don’t. It’s still ASN.1 though, in some format NSS exports private keys in. You need to do a bit of ASN.1 parsing to figure out the values you’re interested in.

Fortunately, I’ve already figured out most of the details for you – check out my Javascript or PHP implementations of the crypto elements required to decrypt Weave Basic Objects.

Finally, a quick note about why we do all this. Sharing is now reasonably easy, if you want to share your bookmarks with someone, you just need to encrypt the corresponding symmetric key with their public key and they’re good to go. Also, each WBO has it’s own ‘encryption’ property so this can be as granular as needed. Secondly, the passphrase is never stored anywhere (except possibly on the user’s computer) so the server never sees anything other than encrypted blobs of Base64′ed text. Along with making HTTPS mandatory, we think this is a pretty secure way of protecting the user’s data.

If you have other encryption schemes that might fit into Weave’s use cases please let us know! (We’ve already been looking at interesting developments in this area such as Tahoe). I’d also love to hear from you if you have any questions on our current cryptography scheme. We’re constantly trying to improve the security and efficiency of our system so these details are only valid until we change our scheme :-)

Now, go write that third-party Weave client, you have no excuse not to!

Follow

Get every new post delivered to your Inbox.