I'd like to take a moment to talk about privacy preserving by default.
I don't intend for this to be a rant about current commercial decisions - instead I'd like it to be praise of what I think (and hope) is great design, and use it to try to set an example that other people can follow. I was talking with a friend recently, and he talked about how, ultimately, most people want personalization, they want ease of use, they want features from services, and they're willing to give up their privacy in order to get those features. I don't disagree with either of his points - I agree with them entirely. But I challenge the assumption that getting those features and that ease of use requires giving up their data to a third party. And I instead pose the question: "If you could get the exact same feature, provided in a privacy preserving way - say, by computing it locally on your phone, as opposed to bulk shipping your data out to a third party and having them analyze it on their servers - I think everyone would prefer to get it in the privacy preserving way. So why not do that?"
Let's talk about a heart attack I had recently.
I use Evernote. Flame me, whatever, I'm not using it for work or for sensitive things, I'm using it for gift ideas I see in stores and simple things. If there was something I could run myself that had a shiny mobile app and a web UI, I'd use that but there isn't so let's move on. I took a photo, while I was at THREADS today, and when I went to add it in Evernote I got this screen:
Note, I recreated this with a quick shot of my laptop
How, in the hell, did it know I was at THREADS?
Was it doing some sort of geolocation combined with local events? I was so disturbed by this I searched for it: evernote smart title [1]. This led me to a blog post announcing the feature.
Now, when you create a new note and save it without giving the note a title, the app will assign a contextual title using calendar events, your location, note contents, and other information.
This is a great example of a totally legitimate, useful feature that most people (including myself) would like. Without it, I'm going to have to type what is likely to be a redundant title (as I'm only putting a few words to remind myself what I took a photo of), or have the title remain 'Untitled'. But as someone somewhat concerned about my privacy, it also filled me with dread. I knew there were two ways this was likely to be implemented. One would be to read my location and calendar locally, and generate a title. The other would be to bulk-ship my data up to their servers, analyze it, and send back a pregenerated title. Let's see which they do.
I don't really intend for this to be an Android App Reversing Walkthrough, but I do want to cover what I had to go through to figure this out, because it's really not that hard and I think the community should be doing more of this to answer questions like "Hey, how the heck does [Flavor of the Week 'Secure' Message App] work?" So I'm going to skip the 'easy' parts, and dig into the more difficult reversing. I'll point to Intrepidus Group and your search engine for getting you past the part where you pull the APK off the device, run it through dex2jar, and decompile the result. At this point, we've got a pile of decompiled Java files. Let's dig in.
$ grep -R title *
This yields 810 results. Way too broad, let's try another tactic.
$ grep -R "Picture from" *
This yields no results. This made me a little nervous, because if the title was generated locally, I'd expect that string fragment to be found somewhere. New tactic. This data came from my calendar, so let's look at calendar API calls. Searching for a few API calls, I found a folder called 'smart/noteworthy'. The feature is called 'smart titles' so this may be it. But before I spend a ton of time reading the code, I can try more 'quick, dirty, and coarse' approaches that may get me nothing, or may get me a jackpot.
In fact, I realized I was omitting a key debugging tool: running the application while tailing logs with adb logcat.
    I/EN (26873): [NewNoteFragment] - canAttachFile()::mimeType=image/*
    I/EN (26873): [NewNoteFragment] - canAttachFile()result=true
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::7
    D/EN (26873): [a] - generateAutoTitle()::title=null
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::5
    D/EN (26873): [a] - starting events query+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    I/EN (26873): [a] - Attachment()::uri=content://media/external/images/media/1764 mMimeType=image/jpeg type=0::mTitle=null
    I/EN (26873): [a] - findType()::type=4 mimeType=image/jpeg
    I/EN (26873): [a] - isSupportedUri()true
    I/EN (26873): [a] - Attachment()::mType=4mTitle=IMG_20131114_184957 mMetainfo=1 Mb mMimeType=image/jpeg
    I/EN (26873): [a] - Attachment()::mTitle=IMG_20131114_184957 mMetainfo=1 Mb
    D/EN (26873): [a] - events=1
    D/EN (26873): [ab] - isMultishotCameraAvailable: platform support = true Native library support = true
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::7
    D/EN (26873): [a] - generateAutoTitle()::title=null
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::6
    D/EN (26873): [NewNoteFragment] - getAddress-running
    I/ActivityManager( 1681): Displayed com.evernote/.note.composer.NewNoteAloneActivity: +373ms
    D/EN (26873): [ab] - isMultishotCameraAvailable: platform support = true Native library support = true
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::7
    D/EN (26873): [a] - generateAutoTitle()::title=Picture from THREADS
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::7
    D/EN (26873): [a] - generateAutoTitle()::title=Picture from THREADS
    I/EN (26873): [NewNoteFragment] - mHandler()::handleMessage()::7
    D/EN (26873): [a] - generateAutoTitle()::title=Picture from THREADS @ Town, State
    D/EN (26873): [NewNoteFragment] - showHelpDialog()
Now that is what I'm looking for. I locate that method, and it has code fragments like:
    localObject2 = paramContext.getString(2131165339);
    ...
    paramContext.getString(2131165687);
If you're a little familiar with Android, you probably realize this is something like context.getString(R.string.YOUR_STRING); but now it's been turned into a constant. Let's trace it down.
    $ grep -R 2131165339 *
    jad-ed/com/evernote/android/multishotcamera/R$string.java: public static int untitled_note = 2131165339;
    jad-ed/com/evernote/note/composer/a.java: localObject2 = paramContext.getString(2131165339);
    jad-ed/com/evernote/note/composer/p.java: paramString = paramContext.getString(2131165339);
    jad-ed/com/evernote/provider/a.java: str2 = this.b.getString(2131165339);
    jad-ed/com/evernote/ui/NewNoteFragment.java: str1 = this.bl.getString(2131165339);
    jad-ed/com/evernote/ui/NewNoteFragment.java: str = b(2131165339);
    jad-ed/com/evernote/ui/QuickSaveFragment.java: this.bm = b(2131165339);
    $ grep -R untitled_note *
    Binary file com.evernote-1.apk matches
    jad-ed/com/evernote/android/multishotcamera/R$string.java: public static int untitled_note = 2131165339;
    Binary file unzipped/classes.dex matches
    Binary file unzipped/resources.arsc matches
Frankly, I'm still not sure why I tracked this down to the exact human-readable resource string - generally speaking, our goal in reverse engineering is to stay as broad as we can until we have to go deep. I traced down more of these constants, inlined them, and followed the trail.
    //These statements were not in this order, just placing them together for brevity
    localObject2 = paramContext.getString("auto_title_from_meeting_at_location", new Object[] { localObject2, str1, str2 });
    localObject2 = paramContext.getString("untitled_note");
    localObject2 = paramContext.getString("auto_title_from_meeting", new Object[] { localObject2, str1 });
    localObject2 = paramContext.getString("auto_title_at_location", new Object[] { localObject2, str2 });

    //Clearly str1 refers to the meeting name, and str2 the location
    //Where do they come from?
    str1 = b();
    str2 = c();

    private String b() //get meeting
    {
      if ((this.m != null) && (this.m.length > 0))
        return this.m[0];
      return null;
    }

    private String c() //get location
    {
      StringBuilder localStringBuilder = new StringBuilder("");
      if (this.a != null) //this.a is "public Address a;"
      {
        String str1 = this.a.getLocality();
        boolean bool = TextUtils.isEmpty(str1);
        int i1 = 0;
        if (!bool)
        {
          localStringBuilder.append(str1);
          i1 = 1;
        }
        String str2 = this.a.getAdminArea();
        if ((!TextUtils.isEmpty(str2)) && (!str2.equalsIgnoreCase(str1)))
        {
          if (i1 != 0)
            localStringBuilder.append(", ");
          localStringBuilder.append(str2);
        }
      }
      return localStringBuilder.toString().trim();
    }

    //Let's trace down this.m
    public final void b(Bundle paramBundle)
    {
      this.a = ((Address)paramBundle.getParcelable(q));
      this.m = paramBundle.getStringArray(p);
    }
So we've figured out where the location-based part comes from. It's using a geolocation API to grab that: this.a is an Address, and c() joins its locality and admin area (the "Town, State" in the log). Hunting down where the meeting name came from is going to be much more difficult.
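To make the local-only path concrete, here is a minimal sketch of how a title like the one in the log could be assembled on the device from a meeting name and an address. The class name, method names, and format strings are my own stand-ins (the real strings live in the app's resources, and I'm only guessing their shape from the logcat output); the location joining mirrors the decompiled c().

```java
// Hypothetical sketch: assemble an auto-title locally from a meeting name and a place.
// The format strings are assumptions inferred from the logcat output, not Evernote's resources.
final class AutoTitleSketch {
    static boolean isEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Mirrors the decompiled c(): "Locality, AdminArea", skipping blanks and duplicates.
    static String formatLocation(String locality, String adminArea) {
        StringBuilder sb = new StringBuilder();
        if (!isEmpty(locality)) {
            sb.append(locality);
        }
        if (!isEmpty(adminArea) && !adminArea.equalsIgnoreCase(locality)) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(adminArea);
        }
        return sb.toString().trim();
    }

    // Picks one of four templates depending on which pieces of context are available.
    static String buildTitle(String noteType, String meeting, String location) {
        boolean hasMeeting = !isEmpty(meeting);
        boolean hasLocation = !isEmpty(location);
        if (hasMeeting && hasLocation) {
            return String.format("%s from %s @ %s", noteType, meeting, location);
        } else if (hasMeeting) {
            return String.format("%s from %s", noteType, meeting);
        } else if (hasLocation) {
            return String.format("%s @ %s", noteType, location);
        }
        return "Untitled note"; // assumed fallback text
    }

    public static void main(String[] args) {
        String where = formatLocation("Town", "State");
        // prints: Picture from THREADS @ Town, State
        System.out.println(buildTitle("Picture", "THREADS", where));
    }
}
```

Nothing in this path needs a network call: both inputs are already on the phone by the time the title is built.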
In fact, this is where I spent the bulk of my time. I did greps like grep -R ".b(" ../../, and when that was too coarse, grep -R "b(" ../../../../ | grep ";" | grep -v "," but I wasn't finding much. I decided to import it into Eclipse. Now clearly, this wasn't going to help me build it.
But I was hoping Eclipse would be able to build enough of it, and provide enough code navigation features to get a couple of hints out of it. And indeed, when I looked up all references to "b(Bundle paramBundle)", I did wind up with one:
Now at this point, I wasn't really getting much out of it. I read a lot of this code, and tried to figure out where things were going. I deciphered a lot more of the surrounding code, running down getString() calls and such. Like I said before - we stay broad until we have to go deep. I alternated between going deep, trying to outline what individual functions did while periodically stepping back and skimming the 2-3 surrounding classes.
Eventually, I was confused enough to take a step back. You see, I'm working with decompiled Java code - this is not what the original developers wrote. It's what a tool has translated from bytecode back into Java. It's a bit spaghetti-like, it's a bit wrong. In fact, one function was actually marked as something that could not be decompiled:
    // ERROR //
    private static String[] b(Context paramContext, String paramString)
    {
      // Byte code:
      // 0: aconst_null
      // 1: astore_2
      // 2: aload_0
So with the experience that only comes from having done this before and understanding the limitations of one's tools, I stepped back even further. I needed to redo this decompilation. Fortunately, there are other Java decompilers out there. And using a second, I was able to get a successful decompilation of this previously undecompilable-b() function.
    private static String[] b(Context context, String s1)
    {
        Cursor cursor;
        ContentResolver contentresolver;
        cursor = null;
        contentresolver = context.getContentResolver();
        Cursor cursor2 = contentresolver.query(Uri.parse("content://com.android.calendar/calendars"), new String[] { "_id" }, s1, null, null);
        Cursor cursor1 = cursor2;
        if(cursor1 == null) goto _L2; else goto _L1
    _L1:
        String as[] = new String[cursor1.getCount()];
        int i1 = 0;
    _L5:
        if(!cursor1.moveToNext()) goto _L4; else goto _L3
    _L3:
        as[i1] = cursor1.getString(0);
        i1++;
        goto _L5
    _L2:
        if(cursor1 != null)
            cursor1.close();
        as = null;
    _L7:
        return as;
This seems obvious in retrospect, but it's only by comparing the decompilations in detail that I saw just how wrong the first one was. Seemingly useless and unreachable code suddenly transformed into meaningful control flow statements. (Protip: compilers almost never emit unreachable code.) The calendar event clearly comes from the calendar, locally, on note creation.
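Read as structured code, the goto-laden b() is roughly the loop below. This is my own hand reconstruction, not decompiler output. RowCursor is a hypothetical stand-in for the handful of android.database.Cursor methods b() uses, so the sketch runs off-device, and closing the cursor in a finally block is my assumption, since the label that would handle it is cut off in the listing.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hand reconstruction of the visible part of b() as structured Java (an approximation).
final class CalendarIdsSketch {
    // Hypothetical stand-in for the subset of android.database.Cursor used by b().
    interface RowCursor {
        int getCount();
        boolean moveToNext();
        String getString(int column);
        void close();
    }

    // Null cursor -> null result; otherwise copy column 0 of every row into an array.
    static String[] readFirstColumn(RowCursor cursor) {
        if (cursor == null) {
            return null;
        }
        try {
            String[] ids = new String[cursor.getCount()];
            int i = 0;
            while (cursor.moveToNext()) {
                ids[i++] = cursor.getString(0);
            }
            return ids;
        } finally {
            cursor.close(); // assumption: the truncated label presumably does this
        }
    }

    // In-memory cursor over single-column rows, for trying the loop out.
    static RowCursor fakeCursor(List<String> rows) {
        Iterator<String> it = rows.iterator();
        return new RowCursor() {
            private String current;
            public int getCount() { return rows.size(); }
            public boolean moveToNext() {
                if (!it.hasNext()) return false;
                current = it.next();
                return true;
            }
            public String getString(int column) { return current; }
            public void close() { }
        };
    }

    public static void main(String[] args) {
        String[] ids = readFirstColumn(fakeCursor(Arrays.asList("1", "2", "3")));
        System.out.println(Arrays.toString(ids)); // prints: [1, 2, 3]
    }
}
```

The point for this investigation is that the query goes to a content:// URI, which is answered by a provider on the device rather than by a remote server.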
Okay, let's step back again. I suspected Evernote might be doing something really cruddy like sending all my calendar events to their server so they can serve these note titles. I'm fairly certain that is not the case. I have not reversed Evernote in its entirety and I am not saying they are not doing something very shady. They may well be. But for this single feature I looked at, I don't think they are. [2]
But ultimately, I come back to the question I posed in the beginning: "If you can provide an awesome feature, and do it in a privacy preserving way, as opposed to a 'do the computation on our servers' approach, why not do that?" and I'll add, why not advertise that? In the age of legal liability for privacy violations and consumer interest in privacy, now compounded even further by Snowden - why not differentiate and advertise on technical constraints for privacy, in addition to making a sleek and awesome app and service? Tell people "Hey, we don't just take your privacy 'seriously' like everyone else, we provide our features on your phone so we never see the data."
Another great example of a complete start-up idea: geolocation based notes. I would love, and pay for, an app that let me put down groceries on a shopping list and reminded me when I go in the grocery store. Let me put a marker "When I drive by this point in the road, at this time of day, remind me to pull over and put the clothes I've been trying to donate for a month in the donation bin." [3] But all the apps I'm aware of that do this either don't work well, or send all your data (including location) to the server. This could run on the phone, there's no reason why it couldn't. I'll do your monetization strategy one better - sell me little Bluetooth or NFC thingies I put at my front door, car, wallet, whatever. Let me make a note like "If I leave the house, remind me to grab the bills I need to mail" or "If I get in the car, and I don't have my wallet, freak out." Or go turn Paul Wouters's privacy preserving Google Latitude-like location-sharing into an app. There are a lot of ideas here.
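As a rough illustration of why this can stay on the phone: deciding whether you are near a saved place is a few lines of arithmetic. The sketch below uses the haversine great-circle distance. The class, the coordinates, and the 150 m radius are all made up for the example, and a real app would still have to get its position fixes from the platform's location APIs and deal with battery and backgrounding.

```java
// Hypothetical sketch: decide on-device whether a location reminder should fire.
final class LocalGeofenceSketch {
    static final double EARTH_RADIUS_M = 6_371_000.0; // mean Earth radius in meters

    // Haversine great-circle distance between two lat/lon points, in meters.
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    // True when the current fix is within radiusMeters of the saved place.
    static boolean shouldRemind(double curLat, double curLon,
                                double placeLat, double placeLon, double radiusMeters) {
        return distanceMeters(curLat, curLon, placeLat, placeLon) <= radiusMeters;
    }

    public static void main(String[] args) {
        // Made-up coordinates: a saved "grocery store" and two position fixes.
        double storeLat = 40.7500, storeLon = -73.9900;
        System.out.println(shouldRemind(40.7505, -73.9902, storeLat, storeLon, 150)); // roughly 58 m away: true
        System.out.println(shouldRemind(40.7600, -73.9900, storeLat, storeLon, 150)); // roughly 1.1 km away: false
    }
}
```

The shopping list and the saved places never have to leave the device for this check to work.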
I'll talk about one more example. RedPhone is an app that lets you make encrypted phone calls to people who also have RedPhone. But because it's annoying to have to manually choose to use RedPhone, plus the problem of knowing which of your contacts have RedPhone to begin with - RedPhone will prompt you to upgrade your call to an encrypted call if the person you're calling also uses RedPhone. How does it know the other person uses RedPhone? Well, it could A) send all your contacts to the server and tell you which people have RedPhone [4], B) send all the people who have RedPhone to you, or C) do something way sexier. What it does is send down a Bloom filter that allows you to ask if any individual number has RedPhone, but doesn't give you the entire list of RedPhone users, nor send your contacts to the server. [5]
That's what I'd like to see more of. I'd like to see novel apps selling an innovative product that people want, not necessarily selling privacy - but still developed in a privacy-preserving way. I believe it can be done - call me naive but I believe you can build an awesome, innovative app that fills a niche and is privacy preserving - not privacy preserving as its selling point with half-baked features added elsewhere. [6]
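For readers who haven't met one, a Bloom filter is a bit array plus a few hash functions: adding an item sets k bits, and a lookup answers either "definitely not present" or "probably present". The sketch below is a generic, minimal version, not RedPhone's actual implementation; the class name, the sizes, the SHA-256 double-hashing scheme, and the phone numbers are my assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

// Generic Bloom filter sketch (NOT RedPhone's code): set membership with
// no false negatives and a tunable false-positive rate.
final class BloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomFilterSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive k bit positions from one SHA-256 digest via double hashing: h1 + i*h2.
    private int[] positions(String item) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(item.getBytes(StandardCharsets.UTF_8));
            ByteBuffer buf = ByteBuffer.wrap(digest);
            long h1 = buf.getLong();
            long h2 = buf.getLong();
            int[] out = new int[numHashes];
            for (int i = 0; i < numHashes; i++) {
                out[i] = (int) Math.floorMod(h1 + i * h2, (long) numBits);
            }
            return out;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    void add(String item) {
        for (int p : positions(item)) {
            bits.set(p);
        }
    }

    // false: definitely not in the set; true: probably in the set.
    boolean mightContain(String item) {
        for (int p : positions(item)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "Server" side: build the filter from registered numbers (made up here).
        BloomFilterSketch filter = new BloomFilterSketch(1 << 16, 5);
        filter.add("+15555550100");
        filter.add("+15555550101");
        // "Client" side: test a contact locally without uploading the address book.
        System.out.println(filter.mightContain("+15555550100")); // an added item always reports true
        System.out.println(filter.mightContain("+15555550199")); // almost certainly false at this size
    }
}
```

Because a "probably present" answer can be a false positive, a real protocol would presumably need some follow-up confirmation before treating a contact as a user.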
And also, to close up with Evernote, I think it'd be awesome if Evernote came out and confirmed that their Smart Title Feature, and their app in general, does not send all your contacts, calendar events, or anything up to their servers except the notes you create.
I recognize there are constraints to doing what I describe: battery life, computation speed, backgrounding, etc etc. I view these in the same vein as other engineering problems - they can be overcome with ingenuity, challenging assumptions, testing, and hard work.
[1] Flame me again for using Google, but DuckDuckGo's search results just aren't as good on this query.
[2] I can't stress this enough. I don't know if Evernote is doing something scummy that I didn't uncover in the literally-one-hour I spent on this. Their privacy policy says things like "we automatically gather non-personally identifiable information that indicates your association with one of our business partners", "Any Content you add to your account", and "The geographic area where you use your computer and mobile devices (as indicated by an IP address or similar identifier)". However, it doesn't say "We take all your data."
[3] I've had a bag in my car for over a month. Sell me this app, please.
[4] I think, but am not sure, this is what Snapchat does.
[5] I'm aware this design isn't perfect, but it's pretty good given the objectives and constraints.
[6] While apps like Silent Circle and Wickr are coming close, I think apps should remember to build an awesome usable product first, and make the privacy preserving part supplemental as opposed to the primary selling point.
Shortly after posting this article, I got an email from a nice guy named Grant at Evernote, who gave me permission to post:
Hi Tom,
I read your blog post about privacy preserving applications and Evernote. I can confirm that the Smart Title feature, and the app in general, does not send all your contacts, calendar events, or anything else to Evernote servers except for the content of the notes you create.
You might find Evernote’s Three Laws of Data Protection interesting–specifically the second law: http://blog.evernote.com/blog/2011/03/24/evernotes-three-laws-of-data-protection/
Having an employee (Grant works as an Engineer, not in 'Public Relations') reach out to a random blog author is, in my opinion, a good sign of a straightforward and honest company. And I quite like the second law.
Everything you put into Evernote is private by default. We never look at it, analyze it, share it, use it to target ads, data mine it, etc.–unless you specifically ask us to do one of these things. Our business model does not depend on "monetizing" your data in any way. Rather, it depends on building trust and providing a great service that more and more people choose to pay for.
So props to Evernote. :)