WURFL Schema

This document explains how WURFL data is organized internally.

WURFL contains tens of thousands of devices, each with hundreds of properties. Collectively, WURFL represents a large virtual matrix of devices and capabilities. Each device often comes in several subversions (often corresponding to different versions of the firmware) that are hard to model.

This is the area where WURFL is smarter than other solutions. It relies on the following assumptions:

  • Browsers are different, but they also have many features in common with one another.
  • Browsers/devices coming from the same manufacturer are most often an evolution of the same hardware/software. In other words, the differences between, say, an iPhone X and an iPhone XS are minimal.
  • Devices from different manufacturers may run the same software. For example, the Android OS runs on devices from Samsung, Sony, Huawei, Xiaomi, OPPO, Motorola, Nokia and others.

Exploiting these assumptions allows WURFL to:

  • Represent device data in very compact ways.
  • Make the repository update process simple.

WURFL is based on the concept of families of devices. All devices are descendants of a generic device, but they may also descend from more specialized families. A device which identifies itself with the user agent Mozilla/5.0 (Android 11; Mobile; rv:82.0) Gecko/82.0 Firefox/82.0 is running the Firefox browser by Mozilla and, of course, is also a descendant of the generic Android 11 device. As a consequence, as soon as such a device is released (or, we should say, as soon as ScientiaMobile detects its user agent hitting a site), we can safely add it to WURFL and state that it is a descendant of the "Firefox" family. This lets that phone inherit all of the capabilities of the Firefox browser family even before the device is actually tested by anyone.

This mechanism, called 'fall_back', lets programmers derive the capabilities of a given phone by looking at the capabilities of its family, unless a certain feature is specifically different for that phone. To clarify further, here is a concrete example. Samsung shipped several subversions of the Galaxy S20 5G (SM-G981U, SM-G981U1, SM-G981W, etc.). WURFL models this knowledge elegantly thanks to the fall_back mechanism. First, the generic device specifies a capability called "model_name":

wurfl.xml:

<device fall_back="root" id="generic" user_agent="">
  <group id="product_info">
      :
    <capability name="model_name" value="" />
      :
  </group>
    :
</device>

You can read this as "generic devices do not have a model name". By default in WURFL, Android phones take their Android version as their model name. This is modeled here:

<device user_agent="DO_NOT_MATCH_GENERIC_ANDROID_10_0" fall_back="generic_android_ver9_0" id="generic_android_ver10_0">
 <group id="product_info">
  <capability
      name="model_name"
      value="Android 10.0" />
 </group>
</device>

For the Galaxy S20 5G, the model_name inherited from the generic Android device needs to be overridden:

<device user_agent="Mozilla/5.0 (Linux; Android 10; SM-G981U) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.119 Mobile Safari/537.36"
        fall_back="generic_android_ver10_0"
        id="samsung_sm_g981u_ver1">
    :
  <group id="product_info">
      :
    <capability name="model_name" value="SM-G981U" />
    <capability name="marketing_name" value="Galaxy S20 5G" />
  </group>
</device>

<device user_agent="Mozilla/5.0 (Linux; Android 10; SM-G981U1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Mobile Safari/537.36"
        fall_back="samsung_sm_g981u_ver1"
        id="samsung_sm_g981u_ver1_subuau1">
    :
  <group id="product_info">
      :
    <capability name="model_name" value="SM-G981U1" />
  </group>
</device>

<device user_agent="Mozilla/5.0 (Linux; Android 10; SM-G981W) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Mobile Safari/537.36"
        fall_back="samsung_sm_g981u_ver1"
        id="samsung_sm_g981u_ver1_subuaw">
    :
  <group id="product_info">
      :
    <capability name="model_name" value="SM-G981W" />
  </group>
</device>

With respect to the model name, this can be read as "there is a family of phones, all variants of the Galaxy S20 5G, whose model_name differs from device to device". All known Galaxy S20 5G devices "fall back" on a single device and inherit its properties, with only model_name differing among the device profiles.
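
The lookup described above can be sketched in a few lines of Python. This is only an illustration of the fall_back idea, not the code of any WURFL API: the function names load_devices and get_capability are hypothetical, the sample is a trimmed copy of the snippets above, and the chain is shortened so that generic_android_ver10_0 falls back directly on generic.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modeled on the snippets above. Assumption: the chain is
# shortened, generic_android_ver10_0 falls back directly on "generic" here.
SAMPLE = """
<devices>
  <device id="generic" user_agent="" fall_back="root">
    <group id="product_info">
      <capability name="model_name" value="" />
    </group>
  </device>
  <device id="generic_android_ver10_0" user_agent="DO_NOT_MATCH_GENERIC_ANDROID_10_0" fall_back="generic">
    <group id="product_info">
      <capability name="model_name" value="Android 10.0" />
    </group>
  </device>
  <device id="samsung_sm_g981u_ver1" user_agent="Mozilla/5.0 (Linux; Android 10; SM-G981U)" fall_back="generic_android_ver10_0">
    <group id="product_info">
      <capability name="model_name" value="SM-G981U" />
      <capability name="marketing_name" value="Galaxy S20 5G" />
    </group>
  </device>
  <device id="samsung_sm_g981u_ver1_subuau1" user_agent="Mozilla/5.0 (Linux; Android 10; SM-G981U1)" fall_back="samsung_sm_g981u_ver1">
    <group id="product_info">
      <capability name="model_name" value="SM-G981U1" />
    </group>
  </device>
</devices>
"""


def load_devices(xml_text):
    """Map each device id to (fall_back id, capabilities set on that device)."""
    devices = {}
    for dev in ET.fromstring(xml_text).iter("device"):
        caps = {cap.get("name"): cap.get("value") for cap in dev.iter("capability")}
        devices[dev.get("id")] = (dev.get("fall_back"), caps)
    return devices


def get_capability(devices, device_id, name):
    """Return the first value found while walking the fall_back chain upward."""
    current, seen = device_id, set()
    while current in devices and current not in seen:  # 'seen' guards against cycles
        seen.add(current)
        fall_back, caps = devices[current]
        if name in caps:
            return caps[name]
        current = fall_back
    raise KeyError(f"{name!r} is not defined for {device_id!r} or its ancestors")


DEVICES = load_devices(SAMPLE)
```

With this sample, get_capability(DEVICES, "samsung_sm_g981u_ver1_subuau1", "model_name") returns the overridden value "SM-G981U1", while asking the same device for "marketing_name" returns "Galaxy S20 5G", inherited from samsung_sm_g981u_ver1.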

The WURFL Device Hierarchy

If you are looking into the wurfl.xml file, the number one concept you should be familiar with is the fall-back hierarchy. The hierarchy allows new devices to inherit their capability values from similar devices, typically from the same manufacturer. The fall-back mechanism is powerful: a correct choice of fall-back gives a high chance that capability values are inferred correctly. ScientiaMobile makes sure that values specific to each device are overridden in the profile for that specific device. This mechanism also allows WURFL to provide sensible defaults for unlisted devices (i.e. devices that don't have a specific profile in WURFL yet) for which browser and OS version can still be determined.
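
To make the point about unlisted devices concrete, here is a deliberately naive Python sketch. It is not how WURFL's matching heuristics work: the function fallback_for_unlisted and the regular expression are hypothetical, and the sketch assumes that every Android version has a family id following the generic_android_verX_Y pattern seen above, which may not hold for every version.

```python
import re

# Hypothetical pattern: picks up "Android 10" or "Android 4.4" in a user agent.
ANDROID_VERSION = re.compile(r"Android (\d+)(?:\.(\d+))?")


def fallback_for_unlisted(user_agent):
    """Guess a family id for a user agent that has no profile of its own."""
    match = ANDROID_VERSION.search(user_agent)
    if match is None:
        # Nothing recognizable: only the conservative generic device applies.
        return "generic"
    major = match.group(1)
    minor = match.group(2) or "0"  # "Android 10" is treated as version 10.0
    # Assumption: the id follows the generic_android_verX_Y naming shown above.
    return f"generic_android_ver{major}_{minor}"
```

An unknown Android 10 phone would then start from the defaults of generic_android_ver10_0 rather than from the much more conservative generic device.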

WURFL XML Structure and Functions

To better explain the concepts and functions introduced with the new system, a basic introduction to the structure of the WURFL XML is provided here. The WURFL XML file is basically a flat list of <device> elements, but the fall_back mechanism allows WURFL users to regard it as a logical tree in which elements have different types, as described below (see also the diagram):

Root (generic)

Root (also known as "the generic element") represents the capabilities of unrecognized HTTP clients. The generic element has some special properties: it contains all WURFL capabilities, always set to very conservative values (it is not wise to make assumptions about unrecognized HTTP clients). This element can be overridden to set values for unrecognized HTTP requests (for example, some may want WURFL to assume that an unrecognized request comes from a web browser, and not a mobile device).

Family

A family is a <device> element that does not represent any specific device, yet it is useful for collecting the capability values that are common to the devices (or sub-families) falling back on the family. Nokia Series 40 is a great example of this.

Actual Device Root

A device marked as 'actual device root' represents an actual device which has been elected as the representative of the (possibly few, possibly many) devices that share the same name but have potentially slightly different sets of features. An example of this is the Galaxy S20 5G, a popular device that comes in many very similar variations (the version made for Verizon may come with different pre-loaded apps, but is essentially the same phone).

Device Subversion

Finally, a device may represent a device subversion, i.e. a device which is in principle very similar to some existing "actual device" (see above) and which has been inserted either to capture the delta with respect to the actual device, or simply to help the UA-string matching heuristics get to the right device when an HTTP request comes in.
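
The four element types can be recovered from the flat list with a small Python sketch. It rests on assumptions that the snippets above do not show: that the 'actual device root' mark is an actual_device_root="true" attribute on the <device> element, and that a family is simply any element that neither carries the mark nor descends from an element that does. The function names are hypothetical, and capabilities and user agents are left out of the sample.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Flat list of devices. Assumption: the 'actual device root' mark is stored
# as an actual_device_root="true" attribute on the <device> element.
SAMPLE = """
<devices>
  <device id="generic" fall_back="root" />
  <device id="generic_android_ver10_0" fall_back="generic" />
  <device id="samsung_sm_g981u_ver1" fall_back="generic_android_ver10_0" actual_device_root="true" />
  <device id="samsung_sm_g981u_ver1_subuau1" fall_back="samsung_sm_g981u_ver1" />
  <device id="samsung_sm_g981u_ver1_subuaw" fall_back="samsung_sm_g981u_ver1" />
</devices>
"""


def build_tree(xml_text):
    """Turn the flat list into parent links, child links and the marked ids."""
    parents, marked = {}, set()
    for dev in ET.fromstring(xml_text).iter("device"):
        parents[dev.get("id")] = dev.get("fall_back")
        if dev.get("actual_device_root") == "true":
            marked.add(dev.get("id"))
    children = defaultdict(list)
    for dev_id, parent in parents.items():
        children[parent].append(dev_id)
    return parents, children, marked


def classify(dev_id, parents, marked):
    """Heuristic role of an element; assumes the fall_back links have no cycles."""
    if dev_id == "generic":
        return "root"
    if dev_id in marked:
        return "actual device root"
    ancestor = parents.get(dev_id)
    while ancestor in parents:  # stops at "root", which is not a device id
        if ancestor in marked:
            return "device subversion"
        ancestor = parents[ancestor]
    return "family"


def render(dev_id, parents, children, marked, depth=0):
    """Return the logical tree below dev_id as indented lines."""
    lines = ["  " * depth + f"{dev_id} [{classify(dev_id, parents, marked)}]"]
    for child in children.get(dev_id, []):
        lines.extend(render(child, parents, children, marked, depth + 1))
    return lines


PARENTS, CHILDREN, MARKED = build_tree(SAMPLE)
TREE = render("generic", PARENTS, CHILDREN, MARKED)
```

Printing "\n".join(TREE) shows generic at the top, the Android 10 family below it, the Galaxy S20 5G as the actual device root, and its two subversions as leaves.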


Diagram: The fall-back hierarchy