Ever heard of MetaData? Wikipedia describes it best:
Metadata (meta data, or sometimes metainformation) is “data about data”, of any sort in any media.
So I hear you thinking: who cares? Well, for starters: you should.
In practice, MetaData reveals a lot more than “data about data” suggests. Documents such as .PDF, .DOC, .XLS, .PPT, … contain information such as:
- Revision history of files (in case of Word documents)
- Usernames of the people who created or edited the file
- Paths to where the file was/is located
- Software version used (Word 5.0, Word 10.0, …)
- Public network shares
- …
If you’re still saying “so what?”, ask yourself the following question: should this data really be public? Should everyone really know the username I use on my computer? Or the names of everyone who contributed to a certain file? Or where I saved it, and what software I used?
If I were a malicious person, I could use that information for a targeted attack. I could send you a phishing e-mail that mentions some of your colleagues by name, or that uses one of those names as the FROM address, so it looks legitimate. I could use the software version number to attach an exploit for that specific version and gain control over your system. I could use your username to brute-force your password.
See a trend there? The MetaData gives away a lot of information that can be abused, and there are plenty of ways to get at it. Consider our good friend Google for a second: it has some very nifty filters you can use to search efficiently. Ever searched for the string “site:microsoft.com filetype:doc”? It gives you a list of all .DOC files found on the microsoft.com site.
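If you want to try the same search for other file types, the query is easy to build yourself. Here is a minimal sketch in Python; the function names `build_query` and `search_url` are made up for this example, and it assumes the usual `https://www.google.com/search?q=...` address that the browser shows when you search.

```python
from urllib.parse import urlencode


def build_query(domain, filetype):
    """Return a search string like the one quoted above."""
    return f"site:{domain} filetype:{filetype}"


def search_url(domain, filetype):
    """Turn the search string into a URL you can paste into a browser.

    Assumption: the search page accepts the query in a 'q' parameter.
    """
    return "https://www.google.com/search?" + urlencode(
        {"q": build_query(domain, filetype)}
    )


if __name__ == "__main__":
    # The document types mentioned earlier in this post.
    for filetype in ("pdf", "doc", "xls", "ppt"):
        print(search_url("microsoft.com", filetype))
```

It only builds the links; opening them and looking at the results is still up to you.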
Guess what information is in those files?
Revision info, for everyone who worked on a file:
revision history – Revision #7: Author ‘benjaxxx’ worked on "
revision history – Revision #6: Author ‘waly xxx’ worked on "
revision history – Revision #5: Author ‘Steve xxx’ worked on "
revision history – Revision #4: Author ‘waly xxx’ worked on "
revision history – Revision #3: Author ‘waly xxx’ worked on "
revision history – Revision #2: Author ‘waly xxx’ worked on "
revision history – Revision #1: Author ‘waly xxx’ worked on "
revision history – Revision #0: Author ‘waly xxx’ worked on "
Paths used on that computer:
H:\SQL\SQL70_sp2\Langs\Spanish\updated_Readme_Localised\test\
\\MULTIMED-SERVER\WWWROOT\Peru\ftpfiles\
C:\WINDOWS\TEMP\
\\Dolphin\adcu\IDEAS\
And the list goes on!
Using only publicly available information, I can get a good idea of the internal layout of a company, and I haven’t even set foot inside it yet. Tools such as Metagoofil simplify this by searching Google for you and extracting the metadata from the documents they find.
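To give an idea of how little work it takes to turn those paths into a picture of the network, here is a rough sketch in Python. It is not how Metagoofil works internally; it just takes the four paths quoted above, written as UNC paths (\\server\share\...) where they point at a network share, and pulls out the server names, share names and drive letters. The names `SAMPLE` and `summarize_paths` are made up for this example.

```python
# Paths copied from the examples above. A raw string keeps the backslashes as-is.
SAMPLE = r"""
H:\SQL\SQL70_sp2\Langs\Spanish\updated_Readme_Localised\test\
\\MULTIMED-SERVER\WWWROOT\Peru\ftpfiles\
C:\WINDOWS\TEMP\
\\Dolphin\adcu\IDEAS\
"""


def summarize_paths(text):
    """Return (shares, drives) found in a block of Windows paths.

    shares is a sorted list of (server, share) tuples taken from UNC paths,
    drives is a sorted list of drive letters such as "C:".
    """
    shares = set()
    drives = set()
    for raw_line in text.splitlines():
        path = raw_line.strip()
        if path.startswith("\\\\"):
            # UNC path: \\server\share\rest\of\path
            parts = path[2:].split("\\")
            if len(parts) >= 2 and parts[0] and parts[1]:
                shares.add((parts[0], parts[1]))
        elif len(path) >= 2 and path[0].isalpha() and path[1] == ":":
            # Local or mapped drive: C:\..., H:\...
            drives.add(path[0].upper() + ":")
    return sorted(shares), sorted(drives)


if __name__ == "__main__":
    shares, drives = summarize_paths(SAMPLE)
    for server, share in shares:
        print(f"server {server} exposes share {share}")
    print("drive letters in use:", ", ".join(drives))
```

Two server names, two share names and a mapped H: drive, all from a handful of documents anyone can download.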