Sunday, April 8, 2012

Android Traffic Statistics Inside

If you're going to write an application that calculates or aggregates traffic statistics on an Android device, you should be aware of some peculiarities and issues.

TrafficStats class

Of course, you will use the TrafficStats class. It has been available since API level 8 (Android 2.2) and reports bytes and network packets transmitted and received: over all interfaces, over the mobile interface, and on a per-UID basis.
Note that you can't get stats separately for, say, roaming; only total mobile traffic is available.
Per-UID stats return only the total for that UID, with no breakdown by network interface.
There is also no time information: if TrafficStats.getMobileRxBytes() returns 12345 bytes, you cannot tell whether that is only today's usage or the last 10 days' usage (the explanation is below).

TrafficStats magic

If you check out the TrafficStats.java sources, you'll see that this class only calls native methods. And what does the native C code do?
As you might know, Android's kernel is based on the Linux kernel. There is a sysfs virtual filesystem that exports information about devices and drivers and is also used for configuration. Each network interface has its own directory under /sys/class/net/. If you list that directory on a Linux machine, you'll probably see something like this:

$ ls /sys/class/net/
eth0 eth1 eth2 lo


On my Android device there are:

$ ls sys/class/net
lo
dummy0
ifb0
ifb1
rmnet0
rmnet1
rmnet2
usb0
sit0
ip6tnl0
gannet0
tun
eth0


Of course, these lists can differ a little on other devices. The list even changes over time on the same device.

Anyway, we're interested in these files:

/sys/class/net/[interface]/statistics/tx_packets
/sys/class/net/[interface]/statistics/rx_packets

/sys/class/net/[interface]/statistics/tx_bytes
/sys/class/net/[interface]/statistics/rx_bytes


These are the files that store the traffic stats data. Each file contains just a number (type long). For example, to get the bytes received over the mobile interface, read:

/sys/class/net/rmnet0/statistics/rx_bytes
or
/sys/class/net/ppp0/statistics/rx_bytes


That's exactly what the native code does: it reads these files and returns the number stored in them.
To get total traffic, it sums rx_bytes/tx_bytes/rx_packets/tx_packets over all interfaces under /sys/class/net/.
Important note: when you turn off an interface, for example Wi-Fi, its directory disappears from /sys/class/net/, and when you turn it on again the directory reappears and the stats are counted from zero.
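The reading and summing described above can be sketched in plain Java. This is only an illustration under assumptions: the class and method names (SysfsTrafficReader, readCounter, sumCounter) are hypothetical, the directory is passed in as a parameter so the code can be tried against any folder, and a missing or unreadable file is simply counted as 0. It is not the platform's native implementation.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

/** Hypothetical helper that mimics the file reads described in the text. */
final class SysfsTrafficReader {

    /** Reads one counter file (for example .../rmnet0/statistics/rx_bytes); returns 0 if it cannot be read. */
    static long readCounter(File file) {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(file));
            String line = reader.readLine();
            return line == null ? 0L : Long.parseLong(line.trim());
        } catch (IOException e) {
            return 0L; // the interface directory may have disappeared
        } catch (NumberFormatException e) {
            return 0L; // unexpected file contents
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException ignored) {
                    // nothing useful to do here
                }
            }
        }
    }

    /** Sums the named counter over every interface directory found under netDir. */
    static long sumCounter(File netDir, String counterName) {
        File[] interfaces = netDir.listFiles();
        if (interfaces == null) {
            return 0L; // netDir does not exist or is not a directory
        }
        long total = 0L;
        for (File iface : interfaces) {
            // Assumption: every entry is summed, as the text describes;
            // a real tool may prefer to skip some interfaces such as lo.
            total += readCounter(new File(new File(iface, "statistics"), counterName));
        }
        return total;
    }
}
```

For example, sumCounter(new File("/sys/class/net"), "rx_bytes") would add up the received-byte counters of whatever interfaces exist at that moment.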

Per-UID stats are taken from the proc file system, which is in fact a pseudo-file system used as an interface to kernel data structures. On an Android device (as on Linux) it is mounted at /proc.
Sent and received traffic stats for a UID are here:

/proc/uid_stat/[uid]/tcp_snd
and
/proc/uid_stat/[uid]/tcp_rcv


To gather per-interface stats for a UID, you have to listen for connectivity changes: register a broadcast receiver that filters on the android.net.conn.CONNECTIVITY_CHANGE action, and then save the traffic values as traffic of the corresponding interface.
Important note: procfs is mounted at boot time, which means that every time your device is rebooted the traffic values for all UIDs start again from 0.
You can list the /proc/uid_stat/ directory right now to see which UIDs have used traffic since the last reboot.
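One possible way to turn a single cumulative per-UID counter into per-network figures, while coping with the counter restarting from 0, is sketched below. Everything here is an assumption made for illustration: the class UidNetworkLedger, its method names, and the "wifi"/"mobile" labels are hypothetical, the first reading is treated as a baseline only, and a reading smaller than the previous one is taken to mean the counter restarted. State is kept in memory, so a real app would have to persist it to survive the reboot it is trying to detect.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: splits one cumulative per-UID counter into per-network buckets. */
final class UidNetworkLedger {
    private final Map<Integer, Long> lastRaw = new HashMap<Integer, Long>();
    private final Map<String, Long> buckets = new HashMap<String, Long>();

    /**
     * Records a new cumulative reading for the UID and books the growth since
     * the previous reading to the given network label.
     * Returns the number of bytes booked by this call.
     */
    long record(int uid, long rawBytes, String network) {
        Long previous = lastRaw.put(uid, rawBytes);
        long delta;
        if (previous == null) {
            delta = 0L;                   // first reading is only a baseline
        } else if (rawBytes < previous) {
            delta = rawBytes;             // counter restarted from zero (e.g. reboot)
        } else {
            delta = rawBytes - previous;  // normal growth
        }
        String key = uid + "/" + network;
        Long soFar = buckets.get(key);
        buckets.put(key, (soFar == null ? 0L : soFar) + delta);
        return delta;
    }

    /** Returns the bytes booked so far for the UID on the given network. */
    long total(int uid, String network) {
        Long value = buckets.get(uid + "/" + network);
        return value == null ? 0L : value;
    }
}
```

The idea would be to call record() on every periodic collection and once more when a connectivity broadcast arrives, passing the network that was active up to that point. A restart that goes unnoticed because the counter has already grown past its old value cannot be detected this way.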

As you can see, if you want a clear picture of your traffic consumption over time, merely calling TrafficStats methods is not enough. You have to catch connectivity changes in order to save applications' usage per interface, handle device reboots, and keep an eye on network interface directories appearing and disappearing in sysfs.

UIDs stats

Let's dig deeper into per-UID stats, since providing traffic usage per application might be one of the key features of your traffic stats app.
A UID (User ID) is unique for each application and stays constant as long as the app is not reinstalled. (Well, an application can explicitly request to share a user ID with another application, but there are security restrictions around this, and that's off-topic right now.)
UIDs below 10000 are reserved for the system. Application UIDs start at 10000.
Here's a static list of system UIDs, which may grow but will never change existing items:

0 - Root
1000 - System
1001 - Radio
1002 - Bluetooth
1003 - Graphics
1004 - Input
1005 - Audio
1006 - Camera
1007 - Log
1008 - Compass
1009 - Mount
1010 - Wi-Fi
1011 - ADB
1012 - Install
1013 - Media
1014 - DHCP
1015 - External Storage
1016 - VPN
1017 - Keystore
1018 - USB Devices
1019 - DRM
1020 - Available
1021 - GPS
1022 - deprecated
1023 - Internal Media Storage
1024 - MTP USB
1025 - NFC
1026 - DRM RPC


There's also UID 2000, which stands for the shell user. For example, if you take a device screenshot using ddms, the traffic of UID 2000 increases by nearly the size of the picture.
In other words, to get system UID stats you should gather stats for all UIDs below 2000.

When collecting stats for installed applications, the first thought that comes to mind is usually to retrieve the list of installed apps with their UIDs and get the stats for each of them:

PackageManager pm = getPackageManager();
List<ApplicationInfo> packages = pm.getInstalledApplications(PackageManager.GET_META_DATA);
for (ApplicationInfo packageInfo : packages) {
  ... // add packageInfo.uid to UIDs list
}


but this can take a long time (even a couple of seconds), which is inefficient. Instead, you can list /proc/uid_stat/ to get only those UIDs that have actually used any traffic:

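import java.io.File;
import java.util.ArrayList;
import java.util.List;

File dir = new File("/proc/uid_stat/");
String[] children = dir.list();
List<Integer> uids = new ArrayList<Integer>();
if (children != null) {
  for (int i = 0; i < children.length; i++) {
    try {
      int uid = Integer.parseInt(children[i]);
      // keep system UIDs (below 2000) and application UIDs (10000 and up)
      if ((uid >= 0 && uid < 2000) || (uid >= 10000)) {
        uids.add(uid);
      }
    } catch (NumberFormatException e) {
      // skip entries whose names are not numeric UIDs
    }
  }
}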


Moreover, using this approach instead of the installed apps list means you won't lose the traffic of uninstalled apps. You probably won't show uninstalled apps in the UI, but your totals will show that there is some extra traffic besides the apps shown.

Always keep in mind that some processes in the system use other processes to do part of their job. Thus the traffic you assume is used by some app may be split between the processes the app has used. When you examine your apps' stats you might be really surprised. For example, viewing a 10MB video with the YouTube application adds 200KB as YouTube app traffic and 10MB as Media traffic (system UID 1013). When the traffic stats tool was added in Android ICS, that "feature" around Media traffic was fixed, but on earlier API levels be ready for that surprise.

Doubling stats bug

There's another nice present for developers who are going to write a traffic statistics tool. Some Android 2.3 based systems collect DOUBLED per-UID stats. Only the per-UID stats are affected; the totals methods (getTotalRxBytes(), getTotalTxBytes(), ...) always return correct data. You can use the totals methods to detect automatically, right after installation, whether the current device has this bug: compute the total traffic as getTotalRxBytes() + getTotalTxBytes() and also as the sum of traffic used by all apps, do this twice with some delay in between, and then compare the two differences. If the bug is present, the growth of the summed app traffic will be nearly twice the growth reported by the totals methods.
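The comparison could look roughly like the sketch below. It works on four numbers you have already collected, so it does not touch any Android API itself. The class name DoubledStatsDetector, the method looksDoubled, and the 1.5 ratio threshold are all assumptions chosen for illustration; the text only says the growth is "nearly twice" as large, so the threshold may need tuning on real devices.

```java
/** Hypothetical check for the doubled per-UID stats bug, based on two samples. */
final class DoubledStatsDetector {

    // Assumed cut-off between "about equal" and "about double".
    static final double DOUBLED_RATIO_THRESHOLD = 1.5;

    /**
     * totalsBefore/totalsAfter: getTotalRxBytes() + getTotalTxBytes() at two moments.
     * uidSumBefore/uidSumAfter: sum of per-UID traffic over all UIDs at the same moments.
     * Returns true when per-UID growth is suspiciously larger than totals growth.
     */
    static boolean looksDoubled(long totalsBefore, long totalsAfter,
                                long uidSumBefore, long uidSumAfter) {
        long totalsGrowth = totalsAfter - totalsBefore;
        long uidGrowth = uidSumAfter - uidSumBefore;
        if (totalsGrowth <= 0 || uidGrowth <= 0) {
            return false; // no traffic between the samples, nothing to compare
        }
        double ratio = (double) uidGrowth / (double) totalsGrowth;
        return ratio >= DOUBLED_RATIO_THRESHOLD;
    }
}
```

With very little traffic between the two samples the ratio is noisy, so it is probably worth waiting until a reasonable amount of data has been transferred before trusting the result.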

Connectivity change broadcasts receiving delay

As I've already mentioned, getting application traffic stats per network interface requires extra work. Assume you collect stats quite frequently (every 15 seconds, for example). While collecting, you determine the current network type:

final NetworkInfo activeNetworkInfo = connectivityManager.getActiveNetworkInfo();
if (activeNetworkInfo == null) {
  // we are not connected anywhere now
} else if ((activeNetworkInfo.getType() == ConnectivityManager.TYPE_WIFI) ||
    (activeNetworkInfo.getType() == ConnectivityManager.TYPE_WIMAX)) {
  // that's wifi
} else {
  // that's mobile
}


and save the data into the corresponding storage.
Assume the device is connected to a Wi-Fi hotspot and traffic data is being collected and saved successfully. Then the connectivity changes: you receive a broadcast saying that the current network is disconnected and do what you need around this event (stop collecting, for example); a few moments later you receive a new broadcast saying that a new network (mobile in our case) is connected, and the next time you collect, you save your traffic data as mobile data. Everything seems to be OK. But sometimes you run into this situation: BEFORE you receive the "disconnected" broadcast, your device is in fact ALREADY connected to a new network, and at that moment your handler collects stats and saves them to the wrong storage:

precondition: wifi's connected
1) collect wifi traffic
2) collect wifi traffic
3) collect wifi traffic // that's all OK
4) wifi's actually disconnected
5) receive broadcast telling wifi's disconnected // do something around this
6) mobile's actually connected
7) receive broadcast telling mobile's connected // do something around this
8) collect mobile traffic
9) collect mobile traffic // that's all OK
10) mobile's actually disconnected
11) wifi's actually connected // !!!! wifi is connected before your application
                                             // gets mobile-disconnected broadcast
12) collect wifi traffic // because current network type is already wifi
                                  // while mobile traffic is expected
13) receive broadcast telling mobile's disconnected // finally... :(
14) receive broadcast telling wifi's connected

Depending on your data gathering algorithm, you might lose (or duplicate) some data because of this delay in receiving broadcasts.


These are all the hazards you should be aware of before writing a traffic stats application for Android.