Responsive Web Typography with WebRTC

I love where emerging web technologies — such as WebRTC (Web Real-Time Communication) and WebAPI — are headed, because they make it possible to use various parts of hardware that already exist inside our computers, tablets and smartphones to improve the user experience. Responsive Typography with WebRTC is yet another example of a simple concept that could improve people’s experiences.

Scratch your itch

Ever since the introduction of media queries and the outbreak of responsive web design, it has bothered me somewhat that current responsive web design methods are based on media queries that solely test the width and height of the viewport (alright, and pixel density too) and that we always have to make assumptions about the rest of the context.

Even though device manufacturers try to follow the reference pixel “treaty”, the inconsistencies (ranging from occasional to severe, depending on the range of devices you are developing for) can be frustrating, and developers have to construct special media queries to get around the problem. By defining the reference pixel sizes for their devices, manufacturers effectively compel people to use each device in a certain way.

In reality, however, people use their devices at different distances. As of yet, there is no clause in the “device purchase agreement” that would bind the new owner to use the device only at a certain reading distance.

There is an array of natural distances, such as the wrist, palm, lap, desk, wall (and mall) distance and devices are already used across those multiple distances, regardless of their form factor. I’ve been pondering that problem every once in a while over the last few years and I came to the conclusion that in order to achieve the perfect experience, we’d have to make the device aware of the user’s exact needs and then adjust the screen to match any given reader-device relationship.

The idea to track proximity between the user and the screen has probably lingered in the minds of many people in the industry. I know that Mark Boulton has been advocating the idea of introducing sensors for better responsive experience and Tim Brown gave an excellent talk on Universal Typography at InspireConf in Leiden last fall.

Essentially, I have been of the opinion that we need to start using devices to passively collect information, and I have always kept my eyes on the device camera as the most probable component, but never found anything remotely tangible. That is until recently, after I learnt about WebRTC and getUserMedia.

WebRTC to the rescue

The WebRTC standard and the getUserMedia API weren’t invented for detecting reading distances, at least judging by their names. However, the ability to capture the user in front of the device with getUserMedia appeared to be the missing link that could turn the idea into a working concept. I thought that if I could take the capture and manipulate it with JavaScript, calculating the reading distance would be the easy part, at least until real proximity events become widely supported.
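Since camera access is not available everywhere, a page that depends on it probably wants to check for it first. The following is a minimal sketch of such a check, not part of the demo: the function name findGetUserMedia and its return values are made up for illustration, and the `nav` parameter stands in for the browser's `navigator` object.

```javascript
// Minimal feature-detection sketch (hypothetical helper, not from the demo).
// `nav` stands in for the browser's `navigator` object so the function
// can also be exercised outside a browser.
function findGetUserMedia(nav) {
    if (!nav) {
        return null;
    }
    // Promise-based API.
    if (nav.mediaDevices && typeof nav.mediaDevices.getUserMedia === 'function') {
        return 'mediaDevices';
    }
    // Callback-based API, unprefixed or vendor-prefixed.
    var legacy = ['getUserMedia', 'webkitGetUserMedia', 'mozGetUserMedia'];
    for (var i = 0; i < legacy.length; i++) {
        if (typeof nav[legacy[i]] === 'function') {
            return legacy[i];
        }
    }
    return null;
}

// In a page: keep the stylesheet's static font size when no camera API exists.
if (typeof navigator !== 'undefined' && findGetUserMedia(navigator) === null) {
    console.log('getUserMedia is not available; keeping the default font size.');
}
```

Returning the name of the entry point rather than a plain boolean is only a convenience for logging; a boolean would work just as well.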

Good people of the Internet

By the time I started learning about the getUserMedia API, a few developers had already developed their own solutions for face recognition and/or head tracking and posted them on GitHub or their personal websites. After some brief testing, I picked out Headtrackr developed by Audun Mathias Øygard, because it already had everything I needed built in.

Headtrackr can return the width and height of the rectangle around a recognized face as well as the head distance from the screen. The latter was less accurate when I first tested it, so I stuck to the face recognition part only. In short, I divided the video width by the “face width” to get the face to canvas ratio, which is used either to compute the root element font size, or as a simple breakpoint query for the respective stylesheet. Have a look at the Responsive Typography demo for different applications.

The Headtrackr code is well explained, but everything in this demo is so ridiculously simple and intuitive that you don’t even need to read the documentation to understand how to use it. For the purpose of the demo, I’ve created three custom, yet fairly simple functions that use the information gathered via Headtrackr to manipulate CSS (we’ll use just two of them in this walkthrough).

The tracking performance is far from perfect at this stage and this is not production grade code. However, I hope that it’s going to spark your imagination and give you a clue about what the future brings.

DIY responsive typography

For the sake of clarity, the code presented in this article is the simplest version, without the green rectangle that follows the face (whilst you are rocking back and forth in front of your web cam) and without the cool looking, but completely optional parameters, updated every 50 milliseconds.

First, the HTML part:

  • video element which is used for the stream
  • canvas element where the magic happens (actually, where the video frames are copied into)
<canvas id="compare" width="320" height="240" style="display:none"></canvas>
<video id="video" autoplay loop width="320" height="240"></video>

Second, the JavaScript part where some variables are set and Headtrackr is initialized:

var d = document,
    videoInput = d.getElementById('video'),
    canvasInput = d.getElementById('compare');
 
var htracker = new headtrackr.Tracker({
        altVideo : {
            ogv: "./media/capture5.ogv", 
            mp4: "./media/capture5.mp4"
        }, 
        calcAngles: true, 
        ui: false, 
        headPosition: false, 
        debug: false
    });
htracker.init(videoInput, canvasInput);
htracker.start();

So far, so good.

The first function, updateFontSize, is — wouldn’t you know — updating the HTML element’s font size. It could update any other element’s font size too, but if you are already familiar with em-based media queries, then controlling the HTML element’s font size makes perfect sense, because it corresponds to how the browser built-in text zooming works (with Ctrl/Cmd + +/-).

The updateFontSize function receives only one argument, the facetrackingEvent event object, of which we only need the width property.

function updateFontSize(ev) {
    var faceWidth = ev.width,
        videoWidth = videoInput.width,
        face2canvasRatio = videoWidth/faceWidth,
        rootSize = Math.round(face2canvasRatio*10)/10 - 1.5 + 10 + 'px';
    d.getElementsByTagName('html')[0].style.fontSize = rootSize;
}

You have probably noticed that I’ve done some “manual normalizing” to get the rootSize based on the face2canvasRatio just right. This is far from optimal, but it translates face2canvasRatio to a number (rounded to one decimal place) adequate enough to be used as a reasonable font size value.
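To see what that normalizing does to real numbers, the arithmetic can be pulled out of the DOM code into a function of its own. This is only a sketch: the helper name rootFontSize and the minPx and maxPx bounds are assumptions added for illustration and are not part of the demo. It also formats the result with toFixed(1), so floating-point noise never ends up in the CSS value.

```javascript
// Hypothetical helper: same arithmetic as updateFontSize, without the DOM.
// minPx and maxPx are assumed bounds, not values taken from the demo.
function rootFontSize(faceWidth, videoWidth, minPx, maxPx) {
    if (!(faceWidth > 0) || !(videoWidth > 0)) {
        return null; // no usable measurement; leave the font size alone
    }
    var ratio = videoWidth / faceWidth;
    var size = Math.round(ratio * 10) / 10 - 1.5 + 10;
    // Clamp so that a tiny or huge detected face cannot produce extreme sizes.
    size = Math.min(maxPx, Math.max(minPx, size));
    return size.toFixed(1) + 'px';
}

console.log(rootFontSize(160, 320, 10, 24)); // face fills half the frame: "10.5px"
console.log(rootFontSize(80, 320, 10, 24));  // face fills a quarter of the frame: "12.5px"
```

With a 320 pixel wide video, a face that shrinks from 160 to 80 pixels (the reader leaning back) raises the root font size by two pixels, which matches the intent of the original function.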

The second function, breakPointClass (again without a particularly inventive name), sets the class name on the BODY element depending on the face2canvasRatio value. These breakpoints were determined empirically by observing and tweaking, so feel free to use your own.

function breakPointClass(ev) {
    var b = d.getElementsByTagName('body')[0],
        faceWidth = ev.width,
        videoWidth = videoInput.width,
        face2canvasRatio = videoWidth/faceWidth; 
    if (face2canvasRatio > 3.2) {
        b.className = 'far';
    }
    if (face2canvasRatio < 2.2) {
        b.className = 'close';
    }
    if (face2canvasRatio >= 2.2 && face2canvasRatio <= 3.2) {
        b.className = '';
    } 
}
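Because the tracker fires events continuously, it may be worth separating the decision from the DOM write and touching className only when the result actually changes. The sketch below uses the same thresholds and class names as breakPointClass; the helper names distanceClass and applyDistanceClass are invented for this example.

```javascript
// Hypothetical helper: map the ratio to the class names used by breakPointClass.
function distanceClass(face2canvasRatio) {
    if (face2canvasRatio > 3.2) {
        return 'far';
    }
    if (face2canvasRatio < 2.2) {
        return 'close';
    }
    return ''; // between 2.2 and 3.2 inclusive: default styles
}

// Write the class only when it differs from the current one.
// `element` is anything with a className property, such as document.body.
// Returns true when the class was changed.
function applyDistanceClass(element, face2canvasRatio) {
    var next = distanceClass(face2canvasRatio);
    if (element.className !== next) {
        element.className = next;
        return true;
    }
    return false;
}
```

Like the original function, this overwrites the whole className, so it assumes the BODY element carries no other classes.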

Finally, the event listener for the facetrackingEvent event passes the event object (and its properties) to the function of choice:

d.addEventListener('facetrackingEvent', function(event) {
    updateFontSize(event);
});
// or
d.addEventListener('facetrackingEvent', function(event) {
    breakPointClass(event);
});

In case you only need the initial value, you can stop the tracker and pause the video streaming to offload the processor:

d.addEventListener('facetrackingEvent', function(event) {
    htracker.stop();
    updateFontSize(event);
    videoInput.pause();
});
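The listener above stays registered after the tracker is stopped. If you want to be sure the update runs exactly once, one option is to remove the listener inside the handler. A minimal sketch, assuming the same d, htracker and videoInput objects as in this walkthrough; the helper name measureOnce is hypothetical.

```javascript
// Hypothetical helper: run `callback` for the first facetrackingEvent only.
// `target` is the object the events are dispatched on (the document here),
// `tracker` needs a stop() method and `video` a pause() method.
function measureOnce(target, tracker, video, callback) {
    function handler(event) {
        // Unregister first, so a late event cannot trigger a second update.
        target.removeEventListener('facetrackingEvent', handler);
        tracker.stop();
        video.pause();
        callback(event);
    }
    target.addEventListener('facetrackingEvent', handler);
}

// In the page from this walkthrough, the call might look like:
// measureOnce(d, htracker, videoInput, updateFontSize);
```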

And that’s it! You’ve done it. Take a look at the complete example.

Now that you know how easy it is to use the built-in hardware, you can start bringing your own ideas to life.

Unleash the Dragon

It’s obvious that we can use technology to improve people’s lives more directly, besides artificially filling the cracks in our broken social patterns (no pun intended). Apart from reading ergonomics, there are many other areas that are barely scratched or that are simply based on proprietary technologies.

For example, we could build and use sensors to monitor heart rate or blood pressure, collect such data on the fly and upload it to the patient’s file on the doctor’s computer. We can use the device’s accelerometer not only to protect the hard drive in the laptop that’s about to hit the floor, but to also alert an impaired person’s caretaker in case of a sudden collapse.

We already use applications like Runkeeper to collect data from our running sessions, so why not couple it with the air pressure on that particular route on that particular day for a more comprehensive dataset? The mere awareness that there are external factors that influence how we perform on a day-to-day basis could lessen everyday frustrations and lead to a happier and healthier life.

The ideas are endless, and with emerging technologies like Firefox OS and Geeksphone we will soon be able to access all the sensors built into those tiny devices we all carry around anyway. And by that time, we will have no excuse not to develop new concepts that can make our lives a tiny bit better.

View full post on Mozilla Hacks – the Web developer blog
