browsertime: live applink coldload, Chrome 74 vs GVE 68 vs Fenix 68 vs Fennec 64 on the Pixel 2 and Moto G5

tl;dr

This report summarizes data for the applink scenario: tapping a link in a third-party app (say, your mail client) and measuring the time that it takes Fenix to start, render the page, and fire the load event. The reported times cover both engine startup (GeckoView or Blink) and the complete pageload.

A Quantum Flow-like startup performance effort to speed up GeckoView startup is needed. Chrome starts navigating in less than 0.5s, and so can we!
The particular site makes almost no difference to the engine startup overhead, meaning that a few representative sites (or even one) are sufficient to measure and track regressions in the applink load time. That will reduce the automation cost of regression testing.

Geometric means

The applink geometric means show that Fenix 68 takes roughly half as long as Fennec 64 to get to the load event, but takes more than one-and-a-half times as long as Chrome 74:

Here are the applink geomeans:

Methodology

Test harness

The data were collected using an ad-hoc Python harness driving the browsertime testing suite. The harness is still under development and will eventually be published; its development is tracked in Bug 1545627.

browsertime drives the underlying vehicles using WebDriver automation: for WebView, chromedriver drives the engine via the Chrome DevTools Protocol; for GeckoView, geckodriver drives the engine over the Marionette protocol.

The version of browsertime used was lightly modified to support Android-specific WebView engine configuration and to support the GeckoView engine. None of these modifications are believed to impact engine performance.

The version of geckodriver was heavily modified to support the GeckoView engine over the adb TCP/IP protocol. These modifications principally concern launching the target vehicle and connecting to the underlying protocol handler; any impact on engine performance has to do with servicing the underlying protocol and ambient engine configuration (for example, custom profiles in GeckoView).

The version of chromedriver was manually patched to support measuring applinks. The patch makes two strings empty: one to change the am start ... invocation, and one to exploit a further unsanitized string substitution. A custom chromedriver could have been compiled for this purpose, but the Chromium build system is not easy to work with and this approach was surprisingly simple.

Vehicles tested

The data were collected from the following vehicle configurations:

vehicle	engine	Tracking Protection
Fenix 68	GeckoView	enabled
geckoview_example 68 (GVE)	GeckoView	disabled
Fennec 64	GeckoView	N/A
Chrome 74	Blink	N/A

Single site test

For each site, the four vehicle configurations were tested using a customized version of browsertime. It starts the vehicle with an Android Intent (-a android.intent.action.VIEW -d $URL) and records two timestamps, both in milliseconds after the epoch: the process start time, and the time at which the engine reports starting the top-level navigation.
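
As an illustration of the Intent described above, the following Python sketch builds an adb command line for it. The helper name applink_start_command and the optional serial argument are hypothetical, and this is not the customized browsertime code. It also does not pin the Intent to a particular vehicle, which a real harness has to do.

```python
import shlex


def applink_start_command(url, serial=None):
    """Return an argv list that opens `url` with an Android VIEW Intent.

    Hypothetical helper for illustration only. `serial` selects a device
    through adb's -s option when more than one device is attached.
    """
    adb = ["adb"]
    if serial is not None:
        adb += ["-s", serial]
    # adb shell hands its arguments to a shell on the device, so the URL is
    # quoted to stop characters such as '&' and '?' from being interpreted.
    return adb + [
        "shell", "am", "start",
        "-a", "android.intent.action.VIEW",
        "-d", shlex.quote(url),
    ]
```

The returned list can be passed to subprocess.run on a host with adb installed.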

The process start time in milliseconds after the epoch was determined by inspecting /proc/${PID}/stat for the relevant process ID. There is some noise in this measurement (generally less than 10ms) so three readings were averaged and rounded to the nearest millisecond. The calculations can be found (as browsertime modifications) here.
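
One way such a calculation could look is sketched below in Python. It is an assumption, not the browsertime modification itself: it takes field 22 (starttime, in clock ticks since boot) from /proc/${PID}/stat, combines it with the device uptime and the current wall-clock time, and then averages several readings. The function names, and the default of 100 clock ticks per second, are assumptions for illustration.

```python
def process_start_epoch_ms(stat_line, uptime_s, now_epoch_ms, clk_tck=100):
    """Estimate a process start time in milliseconds after the epoch.

    stat_line:    contents of /proc/<pid>/stat
    uptime_s:     first value of /proc/uptime (seconds since boot)
    now_epoch_ms: wall-clock time sampled alongside uptime_s
    clk_tck:      kernel clock ticks per second (assumed to be 100)
    """
    # Field 2 (comm) is parenthesized and may contain spaces or ')',
    # so split on the last ')' before splitting on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 22 (starttime) is index 19.
    start_ticks = int(after_comm[19])
    start_since_boot_ms = start_ticks * 1000 / clk_tck
    boot_epoch_ms = now_epoch_ms - uptime_s * 1000
    return boot_epoch_ms + start_since_boot_ms


def averaged_start_ms(readings):
    """Average repeated readings and round to the nearest millisecond."""
    return round(sum(readings) / len(readings))
```

In this sketch the uptime and wall-clock samples are not taken atomically, which is one plausible source of the few milliseconds of noise that averaging is meant to smooth out.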

The engine’s navigation start time origin is also expressed in milliseconds after the epoch, so it is commensurable with the process start time: one can be subtracted from the other.

browsertime reports a wide range of timings, mostly from the Performance Navigation Timing API, and the customized version of browsertime produces these as timings relative to the process start time; see here.

Schematically:

am start ... -a android.intent.action.VIEW -d $URL
/proc/${PID}/stat to get processStartTime
browsertime to get window.performance.timeOrigin and navigationStartTime
Calculations to determine time to applink coldload: navigationStartTime - processStartTime + loadEventStart
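
The final calculation in the list above can be written as a short Python function. This is a sketch: the function and parameter names are chosen for illustration, and it assumes loadEventStart is reported relative to window.performance.timeOrigin, as in the Performance Navigation Timing API.

```python
def applink_coldload_ms(process_start_ms, navigation_start_ms, load_event_start_ms):
    """Time from process start to the load event, in milliseconds.

    process_start_ms:    process start, in ms after the epoch
    navigation_start_ms: window.performance.timeOrigin, in ms after the epoch
    load_event_start_ms: loadEventStart, in ms relative to timeOrigin
    """
    # Engine startup overhead: process start until navigation begins.
    engine_startup_ms = navigation_start_ms - process_start_ms
    # Add the pageload portion, which is measured from navigation start.
    return engine_startup_ms + load_event_start_ms
```

For example, a process that starts at 1000, begins navigating at 1800, and reports loadEventStart as 2500 has an applink coldload time of 800 + 2500 = 3300 ms.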

Sites tested

The sites tested were taken from the product mobile corpus. The following sites were not tested:

site	reason
https://www.allrecipes.com/	live load failures
https://www.allrecipes.com/recipe/16485/barbs-broccoli-cauliflower-salad/?internalSource=hub%20recipe&referringContentType=Search	live load failures

Some sites experienced transient network errors; in these cases fewer measurements were recorded than expected. In any individual run, no site was measured fewer than 4 times.

The entire corpus was tested end-to-end twice in succession on a single Pixel 2 and a single Moto G5.

Results

Fenix 68 applink is significantly faster than Fennec 64

Fenix 68 applink is significantly faster than Fennec 64, but still lags Chrome. Ignoring GVE, the following graph gives insight into how the vehicles compare on the sites in the test corpus for each run with live sites:

GeckoView takes longer to start than Chrome; all engines are consistent across sites

Within a particular vehicle on a particular device, the time from process start until the underlying engine reports that the applink top-level navigation has started is remarkably consistent across sites. This time is essentially the “engine initialization time”, and it’s clear that Gecko is significantly slower than Chrome.

At this time, it is not clear why Fenix is faster than GVE to navigationStartTime.

Fenix 68 vs Fennec 64

We see a nice win for Fenix 68 vs Fennec 64 across the board.

Fenix 68 vs Chrome 74

Fenix 68 lags Chrome 74 across the board. The startup overhead is so bad that the set of sites where tracking protection dramatically improves pageload time (as measured by firing the load event) is greatly reduced: it’s only the worst offenders – the buzzfeeds of the world – where tracking protection can still shine.

Fenix 68 vs GVE 68

Fenix 68 vs GVE 68 is a good proxy for the impact of tracking protection.

Inter-vehicle reliability

Network weather and live site differences

No record-and-replay proxy was used, so nothing minimized the impact of network weather on these measurements.

Some of the sites serve dynamic content and/or advertisements. This means that between individual tests, the content actually fetched over the network may have changed significantly.

Gecko profile conditioning

It is well known that the Gecko profile significantly impacts the performance of the Gecko engine: preferences, certificate databases, and the network cache itself can have major impacts on measurements.

To minimize volatility, for each GeckoView-based vehicle configuration, a Gecko profile was conditioned as follows:

A profile template (with cert9.db and key4.db containing the custom CA certificate used by proxies) was produced.
This profile template was copied to the target device, and the vehicle was started from a cleared state with this profile.
The single page https://example.com was visited.
The vehicle was left idle for 2 minutes.
The vehicle was force-stopped and the conditioned profile retrieved from the device.

This conditioned profile was then copied to the device at the beginning of every test run: that is, every applink cold pageload started with exactly the same Gecko profile.

The code that implements this profile conditioning can be found here.

Versions

Device versions

The output of adb shell getprop for each device is available:

device
Pixel 2
Moto G5

Software versions

package	version (* denotes modified)	link
browsertime	4.7.0 (*)	https://github.com/ncalexan/browsertime/tarball/4f8738e2225b3298556308c0d8c916d3b22e9741
chromedriver	2.46 (*)	https://chromedriver.storage.googleapis.com/index.html?path=2.46/
geckodriver	0.40.0 (*)	https://hg.mozilla.org/users/nalexander_mozilla.com/gecko/rev/e7f1d26dec97a4a26c61b8e18dfe769d4c11e096
Fenix	1.0.1917	https://queue.taskcluster.net/v1/task/TujyjjJqSoqeSVykLTjdmA/artifacts/public%2Ftarget.apk
geckoview_example	N/A	https://queue.taskcluster.net/v1/task/EbHvScwDQBWnoYHJRwZphQ/artifacts/public%2Fbuild%2Fgeckoview_example.apk
Fennec	64.0.2	http://archive.mozilla.org/pub/mobile/releases/64.0.2/android-api-16/multi/fennec-64.0.2.multi.android-arm.apk
Chrome	74.0.3729.112	N/A

Data

Raw data

The data collected for the following tests is available:

run	device	folder
1	Moto G5	https://drive.google.com/open?id=19tYjLirGCTWzyp75RxC3W68Ftf0rhdd0
2	Moto G5	https://drive.google.com/open?id=10mNf9yeqT_apryjjQg7bCEjdNPzcjV8s
3	Pixel 2	https://drive.google.com/open?id=1EMYcH9eswZikgkhRpcm124__mWISkGZL
4	Pixel 2	https://drive.google.com/open?id=1IZjT2stIJNyCIPVZhWKxXMXnYsehkIxw

Processed data

The data from the test runs above can be found in the following CSV file. The columns of the CSV are as follows:

column	description
`device`	the target device, one of “Pixel 2”, “Moto G5”
`run`	the test run number
`site`	the URL of the page being loaded
`engine`	the tested engine’s User Agent string
`proxy`	“live” to signify the site was loaded from the live network
`timestamp`	the local timestamp when the pageload was initiated
`navigationStartTime`	the `window.performance.timeOrigin` reported by the engine under test, measured from the main process start time
`pageLoadTime`	the `loadEventStart` timestamp reported by the engine under test, measured from the main process start time
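
As a sketch of how geometric means could be recomputed from a CSV with the columns above, the following Python groups rows by device and engine and takes the geometric mean of pageLoadTime. The function name is hypothetical, and skipping empty or non-positive values is an assumption about how missing measurements might be handled.

```python
import csv
import io
import math
from collections import defaultdict


def geomeans_by_vehicle(csv_text):
    """Map (device, engine) to the geometric mean of pageLoadTime."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get("pageLoadTime") or "").strip()
        if not raw:
            continue  # no measurement recorded for this row
        value = float(raw)
        if value <= 0:
            continue  # the logarithm is undefined for non-positive values
        groups[(row["device"], row["engine"])].append(value)
    # Geometric mean = exp of the arithmetic mean of the logarithms.
    return {
        key: math.exp(sum(math.log(v) for v in values) / len(values))
        for key, values in groups.items()
    }
```

Because the `engine` column holds a User Agent string, grouping on it separates the vehicles without needing a dedicated vehicle column.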