By Dennis Kennedy, Evan Schaeffer and Tom Mighell
January 2006
Dennis Kennedy (DK): There's been a lot of interest lately in metadata - the hidden data about documents usually thought about in connection with Microsoft Word. In a recent story, edits in a United Nations report were revealed, with embarrassing consequences. These stories have occurred with some regularity over the past several years. Have either of you had any experiences with metadata exposure or favorite stories?
Evan Schaeffer (ES): I was amused when the metadata in a supposedly anonymous memo criticizing Judge Alito's nomination to the Supreme Court revealed the authors to be certain members of the Democratic National Committee. The memo was created in Word. According to news reports, the document's metadata revealed the authors' names and the date when the file was first created. It was well before Alito was even nominated.
Tom Mighell (TM): I have received copies of two PDF pleadings from attorneys in other cases, where whoever converted the file to PDF forgot to turn off the Comments or the additions/deletions. As a result, all of the changes were listed down the right-hand side of the page. No sensitive information, but it sure was embarrassing to the lawyer.
(DK): I haven't been involved in any metadata disclosures that have become public, but I've seen quite a bit of metadata people haven't expected me to see over the years. I've also seen some metadata in my documents that I didn't expect and that I would have preferred not to get out into the wild. However, we had better tell people what metadata is. In the classic definition, metadata means data about data. I'm not sure how much that definition helps us. I think it's easier to think of metadata as hidden data contained in or associated with documents and computer files.
(ES): On the other hand, metadata really isn't "hidden" if you know where to look for it. That's one of the points of the recent controversies. And shouldn't the definition of metadata be general enough to include information like the tags embedded in mp3s that identify the song title and author? That information certainly isn't intended to be hidden.
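As a rough illustration of the mp3 example, here is a minimal Python sketch that reads the older ID3v1 tag, which sits in the last 128 bytes of a file. The function name read_id3v1 is hypothetical, and the sketch assumes the file carries an ID3v1 tag at all; many files use the newer ID3v2 format instead, which this does not handle.

```python
from typing import Optional


def read_id3v1(data: bytes) -> Optional[dict]:
    """Parse an ID3v1 tag from the raw bytes of an MP3 file, if one is present."""
    # An ID3v1 tag occupies the final 128 bytes of the file and starts with b"TAG".
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),     # 30 bytes
        "artist": text(tag[33:63]),   # 30 bytes
        "album": text(tag[63:93]),    # 30 bytes
        "year": text(tag[93:97]),     # 4 bytes
    }
```

To try it on a real file, pass in the file's bytes, for example open("song.mp3", "rb").read(), where song.mp3 is a placeholder name.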
(TM): And although I know that metadata is "hidden" from the casual observer, I don't like to think of it as if it is super-secret, confidential information (although it sometimes is). Metadata is intended to be helpful to the user -- to provide additional information about a document or file and its creation. Being able to keep track of a document's authors, comments, changes, and edits is undeniably useful. Unfortunately, it may also be useful to an opponent who can use it in litigation.
(DK): I like to say that metadata is not inherently bad. It is essential in enabling collaboration on documents. It's also important to realize that the term "metadata" is used in several different ways. In knowledge management, databases and information architecture, metadata refers to useful information that can help us find, maintain and use information in useful ways. Lawyers tend to refer to metadata in a specialized way that refers more generally to hidden data. Lawyers also tend to use "metadata" with a negative connotation. In fact, Evan, isn't there a growing fear about metadata out in the legal community?
(ES): There definitely is. It's the fear that you might have overlooked some key piece of information -- overlooked it because, like you said, it appeared hidden to you -- and then accidentally turned it over to someone who can use it against you. That's exactly the sort of thing that's generated all the media interest in metadata.
(TM): Every technology conference I attend talks about metadata, and I am pleasantly surprised at the increasing number of lawyers who know what it is and why we need to be thinking about it. What is not understood as well in the legal community is how to get rid of it.
(DK): Well, Word documents give us the best place to start and some of the best examples. People need to realize that metadata can be found in many documents created by many programs. Let's talk about three standard examples. Let me note that we use the term metadata in connection with easy-to-find data about files, not data that a computer forensics expert might dig out. First, there is the metadata created more or less automatically when you create, work on and save documents. Second, there is the metadata associated with "track changes" that shows your earlier revisions. Third, there is some metadata associated with the "undo" function or "fast saves" that can sometimes show deletions and changes. Let's start with what we might find if we look at the document properties of a Word document (File > Properties).
(ES): It just so happens that I have a Word document open on my desktop right now. When I look at the file's properties, I see that the "author" is listed as my law partner. She's never worked on the document but I'm using her computer. That's an interesting example of how metadata can be wrong. The metadata also tells me when the document was created and how long I've spent editing it -- way too long, if you must know.
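For readers who want to see the same properties outside of Word, here is a minimal Python sketch. It assumes the document is saved in the newer zip-based .docx format rather than the binary .doc format that Word 2003 uses by default, so it is an illustration of the idea and not a description of the files discussed here. The function name read_docx_properties is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the property parts of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def read_docx_properties(source) -> dict:
    """Return author, last editor, creation date and editing minutes from a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    props = {}
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(archive.read("docProps/core.xml"))
            props["author"] = core.findtext("dc:creator", default="", namespaces=NS)
            props["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", default="", namespaces=NS
            )
            props["created"] = core.findtext("dcterms:created", default="", namespaces=NS)
        if "docProps/app.xml" in names:
            app = ET.fromstring(archive.read("docProps/app.xml"))
            # TotalTime is the cumulative editing time, recorded in minutes.
            props["total_editing_minutes"] = app.findtext(
                "ep:TotalTime", default="", namespaces=NS
            )
    return props
```

As the example of the borrowed computer shows, the "author" value reports whatever name the software was registered under, not necessarily the person who did the typing.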
(TM): My Word metadata is essentially the same, but Evan's answer reminds me of something that happened to me with a PowerPoint file I recently used. I had saved it as a copy of a presentation I had recently given with somebody else, who had actually created the original presentation. The metadata from the original file traveled with my copy, so anyone looking at it would think it was created by my co-presenter.
(DK): Metadata can be useful or even innocuous - in your hands. However, it can be quite revealing in someone else's hands. There are many examples of problems caused by this simple information. Someone might find that the document was created by another law firm, created for another client, created and worked on by someone other than the person whose name shows up on your bill, and worked on for much less time than the hours shown on the bill. It's the context that matters. That'll give people an idea of the first category of metadata. How about everyone's favorite category - track changes?
(ES): It's certainly not my favorite. That's because it can be used to uncover my nearly-illiterate first drafts -- what I call my "chickenscratch." I like to think that once I've saved over a first draft, it's gone forever.
(TM): That's why I rarely use the Track Changes feature -- I don't want to have to worry about prior drafts of my documents. As long as you have Track Changes turned off, you can avoid this problem. I generally only use Track Changes when I have a lengthy, complicated document that I am working on with others in my firm or opposing counsel in a case. In those cases, I'm not all that worried about what the others can see -- if I were, I wouldn't be collaborating on the document with that person.
(DK): The track changes feature can be quite fun to use. It's useful when collaborating on documents, but the fun comes in turning it back on in documents and seeing the earlier revisions. Sometimes people leave it on in documents and you almost can't avoid seeing revisions. This problem is the one that gets your document in the newspaper and can result in embarrassing and potentially costly consequences. If you want to get the full attention of lawyers, show them how the changes in what appears to be a clean document can be revealed.
(ES): It brings to mind another recent news story, this time involving the Vioxx litigation. A few weeks ago, the editors of the New England Journal of Medicine published an editorial in which they claimed they had discovered that "data about three Vioxx patients who suffered heart attacks were excised from a crucial study sponsored by Merck." The quote is from a Wall Street Journal article about the allegations, which continued, "Using word-processing software to track down when changes had been made and by whom, the editors found out the data had been taken out two days before the article was submitted to the journal . . . They determined the deletions were made by someone working from a Merck computer." That's a perfect illustration of your point, Dennis.
(TM): Correct me if I'm wrong, but there's a very easy way to fix this problem (aside from using a metadata remover, like Payne Consulting's Metadata Assistant) -- just go to your Reviewing Toolbar and Accept or Reject the changes -- you can Accept or Reject All, or be choosy about your changes. Once you've dealt with all the changes, you won't see anything if you turn Track Changes on again. Keep in mind though, that there's a difference between Track Changes and Accepting/Rejecting Changes. As long as you properly use the Accept/Reject Changes feature, you can successfully remove your edits from a document. Not so if you just "turn off" the Track Changes option.
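One way to double-check that every change really has been accepted or rejected is to look inside the file itself. The Python sketch below is a minimal illustration that assumes the document is in the newer zip-based .docx format, not the binary .doc format of Word 2003. It counts only tracked insertions, tracked deletions and comments, so a zero result does not prove a file is free of other kinds of metadata. The function name count_tracked_changes is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML elements inside a .docx package.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(source) -> dict:
    """Count unresolved tracked insertions, deletions and comments in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        body = ET.fromstring(archive.read("word/document.xml"))
        # Tracked insertions and deletions are stored as w:ins and w:del elements.
        counts["insertions"] = sum(1 for _ in body.iter(f"{{{W}}}ins"))
        counts["deletions"] = sum(1 for _ in body.iter(f"{{{W}}}del"))
        # Comments, when present, live in their own part of the package.
        if "word/comments.xml" in names:
            comments = ET.fromstring(archive.read("word/comments.xml"))
            counts["comments"] = sum(1 for _ in comments.iter(f"{{{W}}}comment"))
    return counts
```

If all three counts are zero after you have used Accept or Reject, that is consistent with the changes having been resolved. Simply switching Track Changes off would leave the counts unchanged.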
(DK): You're right, Tom, but you have to check all the right boxes and be very careful about how you do that. Computer forensics experts might disagree with us, because they believe they can find anything. I recommend that everyone reading this article watch the Microsoft demo on this topic at http://office.microsoft.com/en-us/assistance/HA011993241033.aspx. The third category is more esoteric. It tends to be a problem in older versions of Word. The idea is that the undo and "fast save" features associate certain information, often deleted material, with a document. In some cases, and this was in the old days, but many law firms still use older versions of Word, you would get a document and find that the "undo" was still "live." If you knew what you were doing, you might also find evidence of earlier revisions in documents. It was fun. I don't think that you can do most of those things any more.
(ES): Fun? It depends on your idea of fun. But it's true that many get pleasure uncovering undocumented bugs in Microsoft products. Your example sounds like a bug to me.
(TM): Gee, Dennis, you stumped me with that one -- I had not heard about it. I know that allowing Fast Saves in Word has traditionally been a way that metadata can be saved, but it's pretty easy to turn that off: Just go to Tools, then Options, then the Save tab, and make sure that Allow Fast Saves is unchecked. This issue is not as big a deal with newer versions of Word, but for lawyers who are still using older versions, this is a good precaution to take.
(DK): It shows the potential problems that can arise when you are not familiar with your default settings in Word. So, that's what's out there. The real question, of course, is what to do about it. There's good news and bad news. The bad news is that some purists might argue that you can never be sure that you fully eliminate metadata. The good news is that you can easily learn the main issues and reasonable ways to address the most common issues. Even better news - after you learn enough to protect your documents, you can turn your skills to finding out what is in other people's documents.
(ES): Can I back up a bit? If you're talking about documents received during discovery, it's important to keep your opponent from removing metadata before turning documents over to you. What about a situation in which your opponent is willing only to give up scanned images but not source files?
(TM): Well then, it comes down to how educated the judge is on this whole metadata thing. But recovering metadata in your opponent's documents is important not just because it might reveal an edit or change that's damaging to their case. The underlying metadata -- in e-mail, for example -- will also show information on when the mail was sent, who received it, and the path it took. This will help to reconstruct the chain of communication on a particular issue.
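To make the e-mail example concrete, here is a minimal Python sketch that pulls the send date, the recipients and the server path out of a raw message using the standard library's email package. The function name summarize_headers is hypothetical, and the sketch assumes you have the full raw message with its headers intact, which a printout or scanned image would not give you.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime


def summarize_headers(raw_message: str) -> dict:
    """Extract when a message was sent, who received it, and the path it took."""
    msg = message_from_string(raw_message)
    sent = msg["Date"]
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    # Each mail server that handles a message adds a Received header at the top,
    # so reading them in reverse gives the path from sender to recipient.
    hops = [" ".join(h.split()) for h in reversed(msg.get_all("Received", []))]
    return {
        "sent": parsedate_to_datetime(sent) if sent else None,
        "from": msg["From"],
        "recipients": [addr for _name, addr in recipients],
        "path": hops,
    }
```

Keep in mind that headers such as Date and From are set by the sender's software and can be wrong or forged, so they are a starting point for reconstructing a chain of communication, not proof of it.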
(DK): And you need to know whether to insist that documents not be scrubbed before they are turned over to you. But, let's focus on some practical steps you can take. Here's the first step: learn how to use the programs you use. Learn to look at your document properties (File > Properties). Learn how "Track Changes" works. Learn the specific issues for the versions of the software you use. That information is readily available. There's no better starting point than looking at the properties tabs in some of your documents.
(ES): That's something I definitely don't do enough. I plan to start, though. The recent news stories we've been discussing should cause everyone to take a closer look at the metadata they're generating.
(DK): I'm going to talk about Word 2003 because that's what I use. You want to dig into your settings - don't worry, it's easy and relatively painless - and make a few simple changes. Go to Tools > Options > Security and find the Privacy Options on the Security tab. Check three boxes: (1) Remove personal information from file properties on save; (2) Warn before printing, saving or sending a file that contains tracked changes or comments; and (3) Make hidden markup available when opening or saving. Even though not necessarily self-explanatory, I think you'll understand the differences these settings can make. Next, go to Tools > Options > Save and check the item called "Prompt for document properties." This setting pops up the document properties tab when you save a document, allowing you to delete information that might be automatically entered into the properties.
(ES): After hunting around a bit, I found some similar tips for WordPerfect users. They're contained in a white paper published by Corel titled "Minimizing metadata in WordPerfect 12 documents." [Link: http://www.corel.com/content/pdf/wpo12/Minimizing_Metadata_In_WordPerfect12.pdf]. The steps are more complicated than the ones you've outlined for Word.
(TM): And while we're throwing out links, here's a good, quick reference on privacy options in Word: http://office.microsoft.com/en-us/assistance/HA011403121033.aspx.
(DK): The "Make hidden markup available when opening or saving" setting may reveal hidden data in documents you receive without you doing anything else.
(ES): It's surprising that it's so easy to get to the metadata if you know how to do it.
(TM): Dennis, that setting is essentially an override for the "Show" function in the Reviewing Toolbar. So regardless of the settings you have selected in the "Show" menu, those items are still going to be displayed if you have the "Make hidden markup..." box checked in your options.
(DK): Unless you really know what you are doing, it's simply not a good move to send out documents in which track changes have been used. My approach to handling metadata is to divide documents into two categories. The first is those documents that you want to share and collaborate on with others who are "on your side." In those cases, metadata is generally a good thing because it helps you work on the document with others. These documents are under your control. The second category is documents sent to people not on your side. In these cases, metadata can be damaging. I think of the second category as if I am "publishing" the document. Once you think in terms of publishing, you better understand what tools to use.
(ES): As someone who has already admitted to not being as careful with my metadata as I'd like, I'll be the first to say your "publishing" concept helps.
(TM): I agree with that as a general policy, but I don't have any problems using Track Changes with people "not on my side" when we are working on a document together -- for example, a settlement agreement or Agreed Judgment. A redlined document, especially when it's lengthy, is much easier to review, to see what your opponent has added or changed. I guess though, by using your "publishing" concept, then my opponent is really on my side for this purpose.
(DK): Again, I think in terms of publishing and publishing a "clean" document. The classic advice was always to save a Word document in the RTF format or in the PDF format. If you wanted to let the other side edit the document, you would save it as an RTF file. If you didn't want to allow edits, you saved as a PDF. That's still pretty good advice, although Adobe Acrobat can introduce its own metadata because you can add comments and revise documents. It gets complicated because metadata is so useful in collaborating on documents. There are other approaches and many people like to use a metadata cleaner. What do you guys like to use?
(ES): If I'm sending a document to someone I'm not collaborating with, I rely on PDF. That's my only trick. Though I've never used a metadata cleaner, you've got me thinking that I should try one out.
(TM): I don't use a metadata cleaner; like Evan, if I am sending a document out to opposing counsel or someone I'm "publishing" to, it goes in PDF. But because I don't often use Track Changes, and because there's nothing else in the metadata that would concern me if it were discovered by others, I confess I don't worry too much about cleaning out metadata. Of course, I have the appropriate boxes checked/unchecked in Word, but that's about it.
(DK): Donna Payne's Metadata Assistant is inexpensive and is probably the de facto standard in the legal industry these days. Microsoft has a free "Remove Hidden Data" tool you can download. Many experts do not like the Remove Hidden Data tool - you need to be aware of its limitations. You might decide it's OK for everyday, low-level documents. However, I suspect most of our readers would rather not be on the spot defending the use of a free tool with highly important and sensitive documents.
(ES): Besides, once you've been identified as having used the "remove hidden data" tool, you're instantly branded as someone who saw a need to remove hidden data. Have you heard the story about the guy who cleaned up his subpoenaed hard drive before turning it over to investigators by using the "evidence eliminator" tool? Reportedly, the tool left some metadata of its own, including metadata that identified the product. You can imagine what happened next.
(TM): There's another program I have heard good things about -- iScrub, from Esquire Innovations, has gotten pretty good reviews.
(DK): There are several other programs as well. You might want to look into the different options. It's worth mentioning that "user error" can negate the value of even the best tools. There have been cases of saving a document as a PDF while the tracked changes are still showing. That's the same thing as sending a printed copy of a document with the tracked changes showing. Oops. Don't forget the belt-and-suspenders approach - scrub the metadata and then save the document as a PDF file. That will give you some confidence about protecting your documents and then you can move on to the fun of metadata - revealing it in other people's documents.
(ES): Good advice, but it shows how difficult and confusing metadata can be -- even your own metadata. When you add in the additional layer of a secretary, things get even more complicated.
(TM): I was just going to say -- often, it's the assistant or paralegal who is e-filing the document, or forwarding it on to opposing counsel. If they don't have the same basic education about the dangers of metadata, then it may not matter what the lawyer does to protect the document.
To Be Continued ...