Snoopy is a distributed tracking
and profiling framework that performs some pretty interesting tracking and
profiling of mobile users through the use of WiFi. The talk was well received
(going on what people said afterwards) by those attending the conference and it
was great to see so many others as excited about this as we have been.
In addition to the research, we both took a different
approach to the presentation itself. A 'no bullet points' approach was decided
upon, so the slides themselves won't be that revealing. Using Steve Jobs as our
inspiration, we wanted to bring back the fun to technical conferences, and our
presentation hopefully represented that. As I type this, I have been reliably
informed that the DVD, and subsequent videos of the talk, are being mastered and
will be ready shortly. Once we have it, we will update this blog post. In the
meantime, below is a description of the project.
Background
There have been recent initiatives from numerous
governments to legalise the monitoring of citizens' Internet based
communications (web sites visited, emails, social media) under the guise of
anti-terrorism. Several private organisations have developed technologies
claiming to facilitate the analysis of collected data with the goal of
identifying undesirable activities. Whether such technologies are used to
identify such activities, or rather to profile all citizens, is open to debate.
Budgets, technical resources, and PhD level staff are plentiful in this sphere.
Snoopy
The above inspired the goal of the Snoopy project:
with the limited time and resources of a few technical minds, could we create
our own distributed tracking and data interception framework with functionality
for simple analysis of collected data? Rather than terrorist-hunting, we would
perform simple tracking and real-time and historical profiling of devices and the
people who own them. It is perhaps worth mentioning at this point that Snoopy
is composed of various existing technologies combined into one distributed
framework.
"Snoopy is a distributed tracking and profiling
framework."
Below is a diagram of the Snoopy architecture, which
I'll elaborate on:
Snoopy runs client-side code on any Linux device that has
support for wireless monitor mode / packet injection. We call these
"drones" because they are ideally small, inconspicuous,
and disposable. Examples of drones we used include the Nokia N900, Alfa R36 router, Sheeva plug,
and the Raspberry Pi.
Numerous drones can be deployed over an area (say 50 all over London) and each
device will upload its data to a central server.
2. WiFi?
A large number
of people leave their WiFi on, even security-savvy folk; for example, at
BlackHat I observed more than 5,000 devices with their WiFi on. As per the 802.11
specification (i.e. this is not down to individual vendors), client devices send out 'probe requests' looking
for networks that the devices have previously connected to (and the user chose
to save). The reason for this appears to be twofold: (i) to find hidden APs
(not broadcasting beacons) and (ii) to aid quick transition when moving between
APs with the same name (e.g. if you have 50 APs in your organisation with the
same name). Fire up a terminal and run this command to see these probe
requests:
tshark -n -i mon0 subtype probereq
(where mon0 is
your wireless device, in monitor mode)
2. Tracking?
Each Snoopy drone collects every observed probe-request,
and uploads it to a central server (timestamp, client MAC, SSID, GPS
coordinates, and signal strength). On the server side, client observations are
grouped into 'proximity sessions', i.e. device 00:11:22:33:44:55 was sending
probes from 11:15 until 11:45, and we can therefore infer that it was within
proximity of that particular drone during that time.
We now know that this device (and therefore its human)
was at a certain location at a certain time. Given enough monitoring stations
running over enough time, we can track devices/humans based on this
information.
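The grouping into proximity sessions could look something like the sketch below. It is a guess at the logic rather than Snoopy's actual server code: the `Session` class, the `proximity_sessions` function, and the 5-minute `max_gap` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Session:
    drone: str
    mac: str
    start: int  # Unix timestamp of the first probe in the session
    end: int    # Unix timestamp of the last probe in the session


def proximity_sessions(observations, max_gap=300):
    """Group (timestamp, drone, mac) observations into proximity sessions.

    A device's session at a drone stays open while consecutive probes are
    no more than max_gap seconds apart (a hypothetical threshold).
    """
    sessions = []
    latest = {}  # (drone, mac) -> most recent Session for that pair
    for ts, drone, mac in sorted(observations):
        current = latest.get((drone, mac))
        if current is not None and ts - current.end <= max_gap:
            current.end = ts  # still nearby: extend the open session
        else:
            # first sighting at this drone, or the device came back after a gap
            current = Session(drone, mac, ts, ts)
            latest[(drone, mac)] = current
            sessions.append(current)
    return sessions
```

Each resulting session reads as "this MAC was near this drone from `start` until `end`", which is the unit the tracking described above is built on.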
3. Passive Profiling?
We can profile device owners via the network SSIDs in the
captured probe requests. This can be done in two ways: simple analysis, and
geo-locating.
Simple analysis could be along the lines of "Hmm,
you've previously connected to hooters, mcdonalds_wifi, and
elCheapoAirlines_wifi - you must be an average Joe" vs "Hmm, you've
previously connected to BA_firstclass, ExpensiveRestaurant_wifi, etc. -
you must be a high roller".
Of more
interest, we can potentially geo-locate network SSIDs to GPS coordinates via
services like Wigle (whose database
is populated via wardriving),
and then from GPS coordinates to street address and street view photographs via
Google. What's interesting here is that as security folk we've been telling
users for years that picking unique SSIDs when using WPA[2] is a "good
thing" because the SSID is used as a salt.
A side-effect of this is that geo-locating your unique networks becomes much
easier. Also, we can typically tell at a glance where you work and where you live
based on the network name (e.g. BTBusinessHub-AB12 vs BTHomeHub-FG12).
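To make the salt point concrete: WPA/WPA2-Personal derives its 256-bit pre-shared key from the passphrase using PBKDF2-HMAC-SHA1, with the SSID as the salt and 4096 iterations. A minimal sketch using Python's standard library follows; the `wpa_psk` helper name and the passphrase are made up for illustration.

```python
import hashlib


def wpa_psk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit WPA/WPA2-Personal pre-shared key."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),
        ssid.encode("utf-8"),  # the SSID is the salt
        4096,                  # iteration count used by WPA/WPA2
        32,                    # 32 bytes = 256-bit key
    )


# Same (made-up) passphrase, different SSIDs: the keys are unrelated,
# so a table precomputed for a common SSID is useless against a unique one.
print(wpa_psk("correct horse battery", "linksys").hex())
print(wpa_psk("correct horse battery", "BTHomeHub-FG12").hex())
```

This is why a unique SSID helps against precomputed tables, and it is the same uniqueness that makes the network easy to geo-locate.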
The result - you walk past a drone, and I get a street
view photograph of where you live, work and play.
4. Rogue Access Points, Data Interception, MITM attacks?
Snoopy drones have
the ability to bring up rogue access points. That is to say, if your device is
probing for "Starbucks", we'll pretend to be Starbucks, and your
device will connect. This is not new, and dates back to Karma in 2005. The
attack may have been ahead of its time, given how few wireless
devices there were back then. Now that every man and his dog has a WiFi-enabled smartphone, the
attack is much more relevant.
Snoopy
differentiates itself with its rogue access points in the way data is routed.
Your typical Pineapple, Silica, or various
other products store all intercepted data locally, and mangle data locally
too. Snoopy drones route all traffic via an OpenVPN connection to a central
server. This has several implications:
(i) We can observe traffic from all drones in the field at one point on the server.
(ii) Any traffic manipulation needs only be done on the server, and not once per drone.
(iii) Since each Drone hands out its own DHCP range, when observing network traffic on the server we see the source IP address of the connected clients (resulting in a unique mapping of MAC <-> IP <-> network traffic).
(iv) Due to the nature of the connection, the server can directly access the client devices. We could therefore run nmap, Metasploit, etc. directly from the server, targeting the client devices. This is a much more desirable approach as compared to running such 'heavy' software on the Drone (like the Pineapple or Pwnphone/plug would).
(v) Due to the Drone not storing data or malicious tools locally, there is little harm if the device is stolen, or captured by an adversary.
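Point (iii) can be illustrated with a short sketch: if every drone hands out addresses from its own subnet, the server can tell which drone a client is behind purely from the source IP. The address pool, prefix length, drone names, and function names below are assumptions for illustration, not Snoopy's actual configuration.

```python
import ipaddress


def allocate_drone_subnets(drone_ids, pool="10.0.0.0/16", prefix=24):
    """Give each drone its own DHCP range carved out of a shared pool."""
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=prefix)
    return dict(zip(drone_ids, subnets))


def drone_for_ip(ip, allocations):
    """Return the drone whose DHCP range contains this client IP, if any."""
    addr = ipaddress.ip_address(ip)
    for drone, subnet in allocations.items():
        if addr in subnet:
            return drone
    return None


allocations = allocate_drone_subnets(["drone-a", "drone-b", "drone-c"])
print(allocations["drone-b"])                  # 10.0.1.0/24
print(drone_for_ip("10.0.1.57", allocations))  # drone-b
```

Because the ranges never overlap, a source IP seen on the server identifies one client behind one drone, which is what makes the MAC <-> IP <-> traffic mapping unique.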
On the Snoopy server, the following is deployed with
respect to web traffic:
(i) Transparent Squid server - logs IP, websites, domains, and cookies to a database.
(ii) sslstrip - transparently hijacks HTTP traffic and prevents the upgrade to HTTPS by watching for HTTPS links and redirects. It then maps those links into either look-alike HTTP links or homograph-similar HTTPS links. All credentials are logged to the database (thanks Ian & Junaid).
(iii) mitmproxy.py - allows for arbitrary code injection, as well as the use of self-signed SSL certificates. By default we inject some JavaScript which profiles the browser to discern the browser version, what plugins are installed, etc. (thanks Willem).
Additionally, a traffic analysis component extracts and
reassembles files, e.g. PDFs, VoIP calls, etc. (thanks Ian).
5. Higher Level Profiling?
Given that we can intercept network traffic (and
have clients' cookies/credentials/browsing habits/etc) we can extract useful
information via social media APIs. For example, we could retrieve all Facebook
friends, or Twitter followers.
6. Data Visualization and Exploration?
Snoopy has two interfaces on the server: a web
interface (thanks Walter), and Maltego transforms.
- Maltego
Maltego Radium has recently been released, and it is one awesome piece of kit for data exploration and visualisation. What's great about the Radium release is that you can combine multiple transforms together into 'machines'. A few example transforms were created to demonstrate:
1. Devices Observed at both 44Con and BlackHat Vegas
Here we depict devices that were observed at both 44Con and BlackHat Las Vegas, as well as the SSIDs they probed for.
2. Devices at 44Con, pruned
Here we look at all devices and the SSIDs they probed for at 44Con. The pruning consisted of removing all SSIDs that only one client was looking for, or those that more than 20 were probing for. This could reveal 'relationship' SSIDs. For example, if several people from the same company were attending, they could all be looking for their work SSID. In this case, we noticed the '44Con crew' network being quite popular. To further illustrate Snoopy we 'targeted' these poor chaps - figuring out where they live, as well as their Facebook friends (pulled from intercepted network traffic*).
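The pruning rule for the second graph is simple enough to sketch in a few lines. This is an illustration of the rule as described, not the actual Maltego transform; the `prune_ssids` name and the `(mac, ssid)` input shape are assumptions.

```python
from collections import defaultdict


def prune_ssids(probes, min_clients=2, max_clients=20):
    """Keep only SSIDs probed for by between min_clients and max_clients devices.

    probes is an iterable of (mac, ssid) pairs. SSIDs sought by a single
    client, or by more than max_clients, are dropped.
    """
    clients_by_ssid = defaultdict(set)
    for mac, ssid in probes:
        clients_by_ssid[ssid].add(mac)
    return {
        ssid: macs
        for ssid, macs in clients_by_ssid.items()
        if min_clients <= len(macs) <= max_clients
    }
```

What survives the pruning are the SSIDs shared by a small group of devices, which is where the 'relationship' networks show up.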
Snoopy Field Experiment
We collected broadcast probe requests to create two main datasets. I collected data at BlackHat Vegas, and four of us sat in various London underground stations with Snoopy drones running for 2 hours. Furthermore, I sat at King's Cross station for 13 hours (!?) collecting data. Of course it may have made more sense to just deploy an unattended Sheeva plug, or hide a device with a large battery pack - but that could've resulted in trouble with the law (if spotted on CCTV). I present several graphs depicting the outcome from these trials:
The pie chart below depicts the proportion of observed devices per vendor, from the total sample of 77,498 devices. It is interesting to see Apple's dominance.
The bar chart below depicts the average number of broadcast SSIDs from a random sample of 100 devices per vendor (standard deviation bars need to be added - it was quite a spread).
The bar chart below depicts my day sitting at King's Cross station. The horizontal axis depicts chunks of time per hour, and the vertical axis the number of unique device observations. We clearly see the rush hours.
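The hourly figures behind a chart like this can be produced by bucketing observations by hour and counting distinct MAC addresses in each bucket. A rough sketch, assuming observations arrive as `(unix_timestamp, mac)` pairs and using UTC for the hour boundaries (both assumptions, as is the function name):

```python
from collections import defaultdict
from datetime import datetime, timezone


def unique_devices_per_hour(observations):
    """Count distinct MAC addresses seen in each one-hour bucket."""
    buckets = defaultdict(set)
    for ts, mac in observations:
        # Label each bucket by the hour it starts in, e.g. "1970-01-01 08:00"
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:00")
        buckets[hour].add(mac)
    return {hour: len(macs) for hour, macs in sorted(buckets.items())}
```

A device that probes many times within the same hour is counted once for that hour, so the bars reflect people passing through rather than how chatty their devices are.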
Potential Use
What could be done with Snoopy? There are likely legal, borderline, and illegal activities. Such is the case with any technology.
Legal
- Collecting anonymized statistics on thoroughfare. For example, Transport for London could deploy these devices at every London underground to get statistics on peak human traffic. This would allow them to deploy more staff, or open more pathways, etc. Such data over the period of months and years would likely be of use for future planning.
- Penetration testers targeting clients to demonstrate the WiFi threat.
Borderline
- This type of technology could likely appeal to advertisers. For example, a reseller of a certain brand of jeans may note that persons who prefer certain technologies (e.g. Apple) frequent certain locations.
- Companies could deploy Drones in each of their establishments (supermarkets, nightclubs, etc) to monitor user preference, e.g. observing a migration of customers from one establishment to another after the deployment of certain incentives (e.g. promotions, new layout).
- Imagine the Government deploying hundreds of Drones all over a city, and then having field agents with mobile Drones in their pockets. This could be a novel way to track down or follow criminals. The other side of the coin of course being that they track all of us...
Illegal
- Let's pretend we want to target David Beckham. We could attend several public events which David is attending (Drone in pocket), ensuring we are within reasonable proximity to him. We would then look for overlap of commonly observed devices over time at all of these functions. Once we get down to one device observed via this intersection, we could assume the device belongs to David. Perhaps at this point we could bring up a rogue access point that only targets his device, and proceed maliciously from there. Or just satisfy ourselves by geolocating places he frequents.
- Botnet infections, malware distribution. That doesn't sound very nice. Snoopy drones could be used to infect users' devices, either by injecting malicious web traffic, or firing exploits from the Snoopy server at devices.
- Unsolicited advertising. Imagine browsing the web, and an unscrupulous 3rd party injects viagra adverts at the top of every visited page?
Similar tools
Snoopy in the Press
FAQ
Q. But I use WPA2 at home, you can't hack me!
A. True - if I pretend to be a WPA[2] network, association will fail. However, I bet your device is probing for at least one open network, and when I pretend to be that one I'll get you.
Q. I use Apple/Android/Foobar - I'm safe!
A. This attack is not dependent on device/manufacturer. It's a function of the WiFi specification. The vast majority of observed devices were in fact Apple (>75%).
Q. How can I protect myself?
A. Turn off your WiFi when you leave home/work. Be cautious about using it in public places too - especially on open networks (like Starbucks).
A. On Android and on your desktop/laptop you can selectively remove SSIDs from your saved list. As for iPhones there doesn't seem to be an option - please correct me if I'm wrong?
A. It'd be great to write an application for iPhone/Android that turns off probe-requests, and will only send them if a beacon from a known network name is received.
Q. Your research is dated and has been done before!
A. Some of the individual components, perhaps. Having them strung together in our distributed configuration is new (AFAIK). Also, some original ideas were unfortunately published first; as often happens with these things.
Q. But I turn off WiFi, you'll never get me!
A. It was interesting to note how many people actually leave WiFi on, e.g. 30,000 people at a single London station during one day. WiFi is only one avenue of attack, look out for the next release using Bluetooth, GSM, NFC, etc :P
Q. You're doing illegal things and you're going to jail!
A. As mentioned earlier, the broadcast nature of probe-requests means no laws (in the UK) are being broken. Furthermore, I spoke to a BT Engineer at 44Con, and he told me that there's no copyright on SSID names - i.e. there's nothing illegal about pretending to be "BTOpenzone" or "SkyHome-AFA1". However, I suspect at the point where you start monitoring/modifying network traffic you may get in trouble. Interesting to note that in the USA a judge ruled that data interception on an open network is not illegal.
Q. But I run iOS 5/6 and they say this is fixed!!
A. Mark Wuergler of Immunity, Inc did find a flaw whereby iOS devices leaked info about the last 3 networks they had connected to. The BSSID was included in ARP requests, which meant anyone sniffing the traffic originating from that device would be privy to the addresses. Snoopy only looks at broadcast SSIDs at this stage - and so this fix is unrelated. We haven't done any tests with the latest iOS, but will update the blog when we have done so.
Source : sensepost.com