<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Wibble &#187; privacy</title>
	<atom:link href="http://www.thewibble.com/tag/privacy/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.thewibble.com</link>
	<description></description>
	<lastBuildDate>Fri, 06 Nov 2009 22:45:49 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Binary Level Metadata in Microsoft Word</title>
		<link>http://www.thewibble.com/2009/10/06/binary-level-metadata-in-microsoft-word/</link>
		<comments>http://www.thewibble.com/2009/10/06/binary-level-metadata-in-microsoft-word/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 21:09:40 +0000</pubDate>
		<dc:creator>Jen</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[anonymity]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[Microsoft Word]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://www.thewibble.com/?p=79</guid>
		<description><![CDATA[The breakdown is this: give me a relatively recent Microsoft Word Document (.doc) and I can tell you what word processor last edited it.

I studied information leakage. There are a plethora of examples of cases where information that wasn’t supposed to be revealed, was. One could argue that it’s the user’s fault for not correctly [...]]]></description>
			<content:encoded><![CDATA[<p>The breakdown is this: give me a relatively recent Microsoft Word Document (.doc) and I can tell you what word processor last edited it.</p>
<p>
I studied information leakage. There are a <a href="http://www.casi.org.uk/discuss/2003/msg00457.html">plethora of</a> <a href="http://news.bbc.co.uk/2/hi/europe/4506517.stm">examples</a> <a href="http://www.sfgate.com/cgi-bin/article.cgi?f=/n/a/2009/02/10/state/n230703S73.DTL">of cases</a> where information that wasn’t supposed to be revealed was. One could argue that it’s the user’s fault for not correctly sanitizing their documents, but I blame WYSIWYG editors. Back in the day of pen and paper, if someone wanted to redact information from a document she was releasing, all she had to do was take a black marker and cross it out. For extra security, she could make a photocopy of the original and only release the photocopy. WYSIWYG editors try to imitate paper in that the document being edited is in theory the one being published, but especially with redacting information, there’s a failure to communicate to the user what’s actually going on. In a WYSIWYG editor, one can’t just put a black box over information to redact it. The same goes for putting a black background on text. The problem: the information is still there.</p>
<p>
The stories about information being incorrectly redacted are more high profile and glamorous, but metadata leakage can also be embarrassing. Metadata can be thought of as data about data. When you create a file, the program that created it stores some identifying information–for example title, author, date of creation. It stores data about the data you just made. I talked earlier about how technology can seem like magic that just works. Again, the problem is that if one thinks of technology this way, privacy and security are never questioned. In this project I examined Microsoft Word Documents–one of the most common file formats for editing and publishing text documents. Word stores metadata, and in a world increasingly worried about metadata, Microsoft offers advice on how to sanitize documents of metadata. While clicking around Microsoft’s help pages, I came across the following <a href="http://support.microsoft.com/kb/223396">snippet</a>:</p>
<blockquote>
    “Some metadata is readily accessible through the user interface of each Office program. Other metadata is only accessible through extraordinary means, such as opening a document in a low-level, binary file editor.”
</blockquote><br />
<p>
Extraordinary means? Thus I set forth trying to determine whether this Computer Science undergraduate could find the metadata Microsoft referred to using “extraordinary means” (also known as the Unix tools <a href="http://linux.die.net/man/1/strings">strings</a> and <a href="http://linux.die.net/man/1/od">octal dump</a>).</p>
<p>
What I found was quite fun. Microsoft Word Documents (of the .doc variety–.docx is an entirely different beast) differ enough on the binary/octal level that I can identify Word files created by Microsoft Office 2003, 2004, 2007, 2008, OpenOffice, and Google Docs. A quick tip on identifying Office version: Microsoft always releases the Windows version the year before the Mac version. Thus Office 2003 and 2007 are the Windows versions and Office 2004 and 2008 are the Mac versions. There are major differences in structure between Windows and Mac Office-produced Word documents and definitely differences between each version. Microsoft Office is a minor nightmare from a backwards compatibility standpoint, so I don’t blame Microsoft for having convoluted file formats (fun fact: Word documents alternate between UTF-8 and UTF-16 encoding). It turns out that when one version of Office (say 2004) opens and saves a Word file created by another version of Office (say 2003), the file structure will be converted from 2003 to 2004. It is possible to create an operating system neutral word processor though: I couldn’t tell the difference between OpenOffice Word files created on Windows computers and those created on Macs. It goes without saying that OpenOffice and Google Docs produced Word files that look very different on a binary level from the Microsoft ones.</p>
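<p>
As a rough illustration of the kind of extraction that strings performs, here is a minimal Python sketch. It is not the tooling used in this project: the function name extract_strings and the min_len parameter are hypothetical, and the sketch assumes that the text of interest is stored either as plain 8-bit ASCII or as UTF-16LE. It only lists printable runs; it does not try to identify which word processor wrote the file.</p>

```python
import re


def extract_strings(data, min_len=4):
    """Return printable runs found in data, as 8-bit ASCII and as UTF-16LE.

    Hypothetical helper for illustration only; it mimics the basic idea of
    the Unix `strings` tool rather than reproducing it exactly.
    """
    # Runs of printable ASCII bytes, at least min_len bytes long.
    ascii_runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    # Runs of printable ASCII characters stored as UTF-16LE,
    # where every character is followed by a zero byte.
    wide_runs = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len, data)
    found = [run.decode("ascii") for run in ascii_runs]
    found += [run.decode("utf-16-le") for run in wide_runs]
    return found


def strings_in_file(path, min_len=4):
    """Read a file in binary mode and list its printable runs."""
    with open(path, "rb") as handle:
        return extract_strings(handle.read(), min_len)
```

<p>
Calling strings_in_file on two .doc files saved by different word processors and comparing the lists is one simple way to start looking for differences like the ones described above.</p>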
<p>
I recognize that looking at Word documents this closely is beyond most Word users’ abilities or desires, but I’m also surprised how easy it was to find differences in the file formats. Microsoft Word stores unintended metadata about what word processor you used to last edit a document. This is troubling since Microsoft has tools that are supposed to strip metadata from documents, but this just goes to show that metadata is embedded deep into documents. I’m guessing that one of the reasons Word moved to a .docx format was that .doc was becoming too cumbersome to deal with. It’s very possible that .docx is operating system and Office version neutral.  I definitely don&#8217;t think that Microsoft was sloppy in creating the .doc format; I just believe that in most moderately complicated file formats constructed in an environment where privacy isn&#8217;t paramount, there will be traces of hidden metadata.</p>
<p>
This was one of the two projects I did at Princeton.  The other, on RFID security, can be found <a href="http://www.thewibble.com/2009/09/30/rfid-and-smart-card-privacy-and-security-concerns/">here</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.thewibble.com/2009/10/06/binary-level-metadata-in-microsoft-word/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RFID and Smart Card Privacy and Security Concerns</title>
		<link>http://www.thewibble.com/2009/09/30/rfid-and-smart-card-privacy-and-security-concerns/</link>
		<comments>http://www.thewibble.com/2009/09/30/rfid-and-smart-card-privacy-and-security-concerns/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 21:21:33 +0000</pubDate>
		<dc:creator>Jen</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[princeton]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[rfid]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://www.thewibble.com/?p=52</guid>
		<description><![CDATA[

http://www.flickr.com/photos/midnightcomm/ / CC BY 2.0


Arthur Clarke once proclaimed that &#8220;any sufficiently advanced technology is indistinguishable from magic.&#8221;  Even as a Computer Science student, I find myself identifying with this idea.  Because I&#8217;ve studied more on the software side, I tend to think of hardware as vaguely magical black boxes.  When dealing with magic, [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: center;"><a title="Blue and Purple RFID tag by midnightcomm, on Flickr" href="http://www.flickr.com/photos/midnightcomm/171587228/"><img src="http://farm1.static.flickr.com/49/171587228_f78f978bd8.jpg" alt="Blue and Purple RFID tag" width="500" height="333" /></a>
<span style="font-size:60%">
<div><a rel="cc:attributionURL" href="http://www.flickr.com/photos/midnightcomm/">http://www.flickr.com/photos/midnightcomm/</a> / <a rel="license" href="http://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a></div>
</span></div><br />
<p>
Arthur Clarke once proclaimed that &#8220;any sufficiently advanced technology is indistinguishable from magic.&#8221;  Even as a Computer Science student, I find myself identifying with this idea.  Because I&#8217;ve studied more on the software side, I tend to think of hardware as vaguely magical black boxes.  When dealing with magic, things are supposed to &#8220;just work&#8221; and we don&#8217;t question why because it&#8217;s all mysterious.  The problem with this thinking is that even if a technology works, it might not work well or have been implemented correctly, especially in terms of security.</p>
<p>
RFID is a magical technology&#8211;it&#8217;s common enough that people have heard of it, but not well-known enough for people to understand how it works.  If you&#8217;re unfamiliar with RFID, it&#8217;s the chip that can be found inside of some credit cards that forms the basis of &#8220;tap and go&#8221; payment.  RFID tags can also be found in many transportation system cards, like the CharlieCard (Boston) or the SmarTrip (D.C.).  RFID tags can store information (like how much money is on your card) and they communicate through radio frequency waves.  The radio waves are why RFID can probably work through your wallet but doesn&#8217;t if you wrap it in aluminum foil.  At Princeton, our student IDs (&#8220;Prox&#8221; cards) have RFID tags inside them and students can use them to access buildings.  They add an extra layer of building security.</p>
<p>
Princeton&#8217;s security is based on our Prox cards, so I wanted to know how secure they were.  I used an off-the-shelf RFID reader (an Omnikey CardMan 5321, around $100) and open source software (RFIDIOt, free) to see what I could get out of the RFID cards I had, including a Princeton Prox card, a CharlieCard, and a Princeton Public Library card.  Luckily (or unluckily for me), the Princeton Prox card was an HID iCLASS card, which I found in my literature study to be one of the more secure cards on the market.  HID claims that it built physical anti-cloning measures into the card (cloning means copying a card).</p>
<p>
However, I discovered that hotlisting attacks were very possible with all three cards I had.  Hotlisting is an attack that involves tracking an individual through a unique identifier (UID), a number that is unique to a given card.  Each of the cards had a UID that I could read with my unauthorized reader, and since it was a unique number, I could link it directly to that card.  Because each card is linked strongly with one individual, I could then track individuals if I had a point of reference where I could confirm their identity and read the UID off their card.  Reading a card&#8217;s RFID tag is very unobtrusive, especially when the cards are commonly used.  All it would take is brushing up against an individual&#8217;s wallet, and I would have the number.  This means that if I wanted to track an individual&#8217;s movements, all I would have to do is place a number of RFID readers in key locations and obtain that person&#8217;s UID.  Since I could read the UID of all the cards I tested and considering the ubiquity of cards with RFID tags, I believe that most people are trackable.  RFID tags are also being found in items other than cards, such as library books and EZ Pass or related electronic toll payment systems.  As more cards add RFID tags, this will become a bigger issue.  Whenever you carry your card, you are followable.</p>
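<p>
To make the tracking step concrete, here is a minimal Python sketch of what an attacker could do with the data once it is collected. It makes several assumptions: each reader logs a (timestamp, location, UID) record, and the names build_tracks and trail_for, the record layout, and the sample values in the usage note are all hypothetical. It does not talk to any real reader or use the RFIDIOt API.</p>

```python
from collections import defaultdict


def build_tracks(sightings):
    """Group (timestamp, location, uid) records into one trail per UID.

    Hypothetical record layout: each reader placed at a key location logs
    the time, its own location, and the UID it read.
    """
    tracks = defaultdict(list)
    # Sorting the tuples orders the records by timestamp first.
    for timestamp, location, uid in sorted(sightings):
        tracks[uid].append((timestamp, location))
    return dict(tracks)


def trail_for(tracks, known_uid):
    """Return the ordered locations for a UID already linked to a person."""
    return [location for _, location in tracks.get(known_uid, [])]
```

<p>
For example, given the made-up records [(3, "library", "04A1"), (1, "dorm", "04A1"), (2, "dorm", "9F00")], calling trail_for(build_tracks(records), "04A1") gives the ordered trail for that one card.  The point is that no card contents need to be decoded: a stable UID plus a single confirmed sighting is enough.</p>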
<p>
This was one of two research projects I completed during my junior year at Princeton.  <a href="http://www.thewibble.com/2009/10/06/binary-level-metadata-in-microsoft-word/">Here is my other project</a> on hidden metadata in Microsoft Word Documents.</p>]]></content:encoded>
			<wfw:commentRss>http://www.thewibble.com/2009/09/30/rfid-and-smart-card-privacy-and-security-concerns/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>CDT Blog Posts</title>
		<link>http://www.thewibble.com/2009/08/29/cdt-blog-posts/</link>
		<comments>http://www.thewibble.com/2009/08/29/cdt-blog-posts/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 16:28:28 +0000</pubDate>
		<dc:creator>Jen</dc:creator>
				<category><![CDATA[Technology Policy]]></category>
		<category><![CDATA[Writing]]></category>
		<category><![CDATA[broadband]]></category>
		<category><![CDATA[cdt]]></category>
		<category><![CDATA[cybersecurity]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[real id]]></category>
		<category><![CDATA[rfid]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[ssn]]></category>
		<category><![CDATA[wiretapping]]></category>

		<guid isPermaLink="false">http://www.thewibble.com/?p=27</guid>
		<description><![CDATA[


http://www.flickr.com/photos/ghost_bear/ / CC BY-NC-SA 2.0



As a continued act of record keeping, here are the blog posts I did for the Center for Democracy and Technology on their PolicyBeta blog during my internship.  I had a great time there and learned a lot about Internet/Security/Privacy policy and how government really works.  I worked on several projects [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: center;">
<a href="http://www.flickr.com/photos/ghost_bear/2981281195/" title="Farragut West Wanderers by Ghost_Bear, on Flickr"><img src="http://farm4.static.flickr.com/3279/2981281195_30af13783c.jpg" width="500" height="333" alt="Farragut West Wanderers" /></a>
<span style="font-size:60%">
<div xmlns:cc="http://creativecommons.org/ns#" about="http://www.flickr.com/photos/ghost_bear/2981281195/"><a rel="cc:attributionURL" href="http://www.flickr.com/photos/ghost_bear/">http://www.flickr.com/photos/ghost_bear/</a> / <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">CC BY-NC-SA 2.0</a></div>
</span>
</div><br/>

<p>As a continued act of record keeping, here are the blog posts I did for the <a href="http://www.cdt.org">Center for Democracy and Technology</a> on their <a href="http://blog.cdt.org/">PolicyBeta</a> blog during my internship.  I had a great time there and learned a lot about Internet/Security/Privacy policy and how government really works.  I worked on several projects at CDT, some of which resulted in blog posts.  One of my projects was writing the &#8220;CRS Report of the Week&#8221; posts.  CRS is the <a href="http://en.wikipedia.org/wiki/Congressional_Research_Service">Congressional Research Service</a>, the &#8220;Congressional Thinktank&#8221; that does policy reports for Congressmembers.  They produce <a href="http://en.wikipedia.org/wiki/Congressional_Research_Service_reports">CRS Reports</a>, which explain current legislative issues.  CRS Reports aren&#8217;t directly available to the public, which is interesting since CRS is taxpayer-funded to the tune of $100 million a year.  CDT runs a project called <a href="http://opencrs.com/">Open CRS</a> which liberates CRS Reports found in the wild.  I wrote the CRS Report of the Week blog posts to illustrate how useful CRS Reports are.  They provide great introductions to topics and are often surprisingly timely.  Read one if you want to understand an issue.  I also worked on the Browser Privacy Report and PASS ID.</p>

CRS Report of the Week
<ul>
	<li><a href="http://blog.cdt.org/2009/06/30/crs-weekly-report-comprehensive-national-cybersecurity-initiative/">Comprehensive National Cybersecurity Initiative</a> [June 30th, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/07/09/crs-weekly-report-the-social-security-number/">The Social Security Number</a> [July 9th, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/07/17/crs-weekly-report-the-real-id-act-of-2005/">The REAL ID Act of 2005</a> [July 17th, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/07/23/crs-weekly-report-access-to-broadband-networks/">Access to Broadband Networks</a> [July 23rd, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/07/31/crs-report-of-the-week-privacy-law-and-online-advertising/">Privacy Law and Online Advertising</a> [July 31st, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/08/07/crs-report-of-the-week-wiretapping-and-electronic-eavesdropping/">Wiretapping and Electronic Eavesdropping</a> [August 7th, 2009]</li>
</ul>
Projects I Worked on
<ul>
	<li><a href="http://blog.cdt.org/2009/08/05/cdt-releases-update-to-browser-privacy-report/">CDT Releases Update to Browser Privacy Report</a> [August 5th, 2009]</li>
	<li><a href="http://blog.cdt.org/2009/08/07/rfid-skimming-is-easier-than-you-think/">RFID Skimming Is Easier Than You Think</a> [August 7th, 2009]</li>
</ul>
<p>The photo is the <a href="http://en.wikipedia.org/wiki/Farragut_West">Farragut West Metro Station</a>, next to which CDT is located and where I got off every day.</p>]]></content:encoded>
			<wfw:commentRss>http://www.thewibble.com/2009/08/29/cdt-blog-posts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WWS 586F Class Blog Posts</title>
		<link>http://www.thewibble.com/2009/08/29/wws-586f-class-blog-posts/</link>
		<comments>http://www.thewibble.com/2009/08/29/wws-586f-class-blog-posts/#comments</comments>
		<pubDate>Sat, 29 Aug 2009 15:48:33 +0000</pubDate>
		<dc:creator>Jen</dc:creator>
				<category><![CDATA[Technology Policy]]></category>
		<category><![CDATA[Writing]]></category>
		<category><![CDATA[cs education]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[google earth]]></category>
		<category><![CDATA[griefers]]></category>
		<category><![CDATA[kindle]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[protect the children]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.thewibble.com/?p=17</guid>
		<description><![CDATA[
http://www.flickr.com/photos/joeshlabotnik/ / CC BY 2.0

As a matter of record keeping and curiosity, here are the blog posts I wrote for the seminar on Information Technology and Public Policy that I took at the Woodrow Wilson School.  Many of them, especially the ones on the Facebook Terms of Service and the Kindle 2, are now [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: center;"><a title="Woodrow Wilson School by Joe Shlabotnik, on Flickr" href="http://www.flickr.com/photos/joeshlabotnik/2218123758/"><img src="http://farm3.static.flickr.com/2259/2218123758_273f6f7192.jpg" alt="Woodrow Wilson School" width="500" height="321" /></a>
<span style="font-size:60%"><a rel="cc:attributionURL" href="http://www.flickr.com/photos/joeshlabotnik/">http://www.flickr.com/photos/joeshlabotnik/</a> / <a rel="license" href="http://creativecommons.org/licenses/by/2.0/">CC BY 2.0</a></span></div><br/>
<p>
As a matter of record keeping and curiosity, here are the blog posts I wrote for the seminar on Information Technology and Public Policy that I took at the Woodrow Wilson School.  Many of them, especially the ones on the Facebook Terms of Service and the Kindle 2, are now outdated due to events in the past months, but others are still relevant.  Posts 4 and 9 cover topics that still matter today&#8211;4 is on how Computer Science education (especially at lower levels) could be improved and 9 discusses some of the tragedies that occur with what are normally positive traits of the Internet: the ability to disseminate information quickly, to keep sources anonymous, and to retain information for an indefinite amount of time.</p>
<ol>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=17">Facebook wants to own your life</a> [February 22nd, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=24">The Kindle 2&#8217;s Correct Copyright Claims (and the Authors Guild&#8217;s Incorrect Ones)</a> [February 28th, 2009]*</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=34">Blurring Google Earth</a> [March 7th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=46">Computers in Our World</a> [March 28th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=60">TVGuardian Will Protect Us All</a> [April 4th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=70">The Kindle 2: A New Hope (for the disabled)</a> [April 11th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=82">eBooks and mp3s</a> [April 18th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=93">Protecting Children from the Indescribable Filth of YouTube</a> [April 25th, 2009]</li>
	<li><a href="http://courseblog.cs.princeton.edu/spring09/wws586f/?p=102">Grief From Griefers</a> [May 2nd, 2009]</li>
</ol>
<p>*Tim Lee (the tech libertarian) was in my class!  He <a href="http://techliberation.com/2009/04/09/princeton-students-including-me-blog-about-tech-policy/">approved</a> of this post.</p>]]></content:encoded>
			<wfw:commentRss>http://www.thewibble.com/2009/08/29/wws-586f-class-blog-posts/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
