When you upload documents, such as PDFs or Word files, to your website, you may be unwittingly divulging information that could prove useful to hackers and other outside parties. If you’ve had any sort of vulnerability assessment performed on your website, you may have seen a finding that references this metadata. While it’s a relatively low-risk threat, you’ll likely want to remove this data from documents before you publish them to your website, if only to stay in the good graces of your risk department.
What is metadata?
Metadata, in this situation, refers to extra information embedded in a document when it’s saved, such as the author’s name, department name, keywords used for indexing and searching, and sometimes file paths to internal network resources. PDF documents are by far the most common type of file likely to contain metadata on bank and credit union websites. To view the metadata associated with a PDF, open the document in Adobe Acrobat Reader and then click the File > Properties menu item (or press CTRL+D). The screenshot below shows the first PDF that Google returned when I searched for ‘bank rates PDF’.
Example of metadata in a PDF file on a bank website
You can see that the author’s name associated with this document is Gerry Pfeffer. While it’s no secret that Gerry Pfeffer is somehow affiliated with this bank since the information is readily available elsewhere on the internet, there is a good chance that this will still show up on a vulnerability assessment and will likely fall under the heading of something that sounds ominous.
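If you would rather check a batch of files without opening each one in Acrobat Reader, a short script can give you a rough first pass. The sketch below is my own illustration, not a complete PDF parser: the function name `scan_pdf_bytes` is made up for this example, and it only finds properties stored as plain, unencrypted text in the file. Metadata held in compressed streams, hex strings, or XMP packets will be missed, so an empty result does not prove a file is clean.

```python
import re

# A few standard keys of a PDF document information dictionary.
INFO_KEYS = (b"Author", b"Subject", b"Keywords", b"Creator", b"Producer")

# Matches entries such as "/Author (Jane Doe)", allowing backslash escapes.
_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)",
    re.DOTALL,
)


def scan_pdf_bytes(data: bytes) -> dict[str, str]:
    """Naively scan raw PDF bytes for plain-text document properties."""
    found = {}
    for key, raw in _PATTERN.findall(data):
        # Drop the backslash in front of escaped parentheses and backslashes.
        value = re.sub(rb"\\([()\\])", rb"\1", raw)
        found[key.decode("ascii")] = value.decode("latin-1")
    return found
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example `scan_pdf_bytes(open("rates.pdf", "rb").read())`, where `rates.pdf` is a placeholder file name.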
The easiest way to prevent accidental disclosure of metadata is to not insert it into your PDF documents in the first place. In Microsoft Word 2013, when you save a file as a PDF, Word presents an Options button. Click this button and uncheck the box that includes the document properties, otherwise known as metadata, and your PDF will not contain any of the metadata that Word normally saves to the PDF file.
Uncheck the Document properties box
If you plan to publish Microsoft Word documents to your website (we highly recommend creating PDF versions instead), you can also remove the metadata from the Word file before you save it. Microsoft has instructions for removing metadata here.
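To confirm that a Word file really is clean before you publish it, you can look inside it yourself. A .docx file is a ZIP archive, and its core properties (author, last modified by, and so on) live in a part named `docProps/core.xml`. The sketch below uses only the Python standard library; the function name `read_docx_core_properties` is my own, and it reads only the core properties, not custom properties, comments, or tracked changes.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_docx_core_properties(path_or_file) -> dict[str, str]:
    """Return the non-empty core properties stored in a .docx file."""
    with zipfile.ZipFile(path_or_file) as archive:
        try:
            xml_bytes = archive.read("docProps/core.xml")
        except KeyError:
            # The archive has no core properties part at all.
            return {}

    root = ET.fromstring(xml_bytes)
    props = {}
    for child in root:
        # Tags look like "{namespace}creator"; keep only the local name.
        name = child.tag.split("}", 1)[-1]
        if child.text and child.text.strip():
            props[name] = child.text.strip()
    return props
```

An empty dictionary from a file you have already cleaned suggests the core properties were removed. A `creator` or `lastModifiedBy` entry means a name is still embedded.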