Bay Area Computer Forensics Expert, Investigator & Witness

News & Computer Forensics Blog

Author Jon Berryhill

Computer Forensics Investigative Expert and Certified Expert Witness for Military, State and Federal Courts

What is Metadata?

11/11/2019

by Jon Berryhill

To understand metadata, you first have to understand what the word means. The prefix “meta” means “beyond” and is used to indicate a concept that is an abstraction behind another concept. From this we get the meaning that metadata is the “data beyond the data”. In the world of digital forensics, metadata is the data and information that is part of, or attached to, some other more obvious piece of data. We usually think of metadata as being associated with a particular file. Every file on a computer has some amount of metadata associated with it. The amount, type and usefulness of that data depend on the type of file and the type of investigation.


I usually break metadata down into two broad groups: internal and external. Every file on a computer or any digital storage media has some external metadata, and most user-created files have varying amounts of internal metadata.

On all modern computer systems the minimum metadata is the external metadata that consists of several date/time stamps that memorialize the file creation, last access and last written date/time. That information, along with the file name, is not stored with the file but rather in a table maintained by the operating system for each storage device (and stored on that device). It doesn’t matter if it’s a hard drive, thumb drive or SD card. Each storage device has a table, separate from the files, that exists for housekeeping purposes. Think of it like a card catalog system in an old-fashioned library. The card in the little drawer has the name of a book, directions on where to find it and a small amount of other information about the book that would vary depending on the library system and the type of book. The tables maintained by the computer are similar. The table has the name of the file, various date/time stamps and directions for the computer on where to find the file. In this imaginary virtual library, the books don’t have covers that contain the sort of information you might expect to find if you just browsed the shelves and pulled out a book at random. You need to cross reference the book with the card catalog card to get the full picture. There is other information there too that is usually less interesting from an investigative standpoint.
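
As a rough illustration of the external metadata described above, the following Python sketch asks the operating system for the three basic date/time stamps of one file. The function name external_timestamps is made up for this example. Which field holds the creation time varies by platform, so the handling of st_birthtime and st_ctime below is an assumption about common systems, not a forensic procedure. Simply reading a file on a live system can also change its last access stamp.

```python
import os
from datetime import datetime, timezone


def external_timestamps(path):
    """Return the basic file-system time stamps for one file as UTC datetimes.

    Hypothetical helper for illustration only.
    """
    st = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    # st_birthtime is available on some platforms (for example macOS).
    # On Windows, st_ctime has traditionally held the creation time.
    # On Linux, st_ctime is the inode change time, NOT the creation time,
    # so "created" is reported as None there.
    birth = getattr(st, "st_birthtime", None)
    if birth is None and os.name == "nt":
        birth = st.st_ctime

    return {
        "created": to_utc(birth) if birth is not None else None,
        "last_written": to_utc(st.st_mtime),
        "last_access": to_utc(st.st_atime),
    }
```

Calling external_timestamps("report.docx") (a hypothetical file name) returns a dictionary with the keys created, last_written and last_access.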

Even this most basic metadata can be misinterpreted and is often misunderstood. Top of this list is the file creation date/time. This date/time is not what you might think from the simple name. It is when that particular file was first written to the storage media on which we find it. The other basic date/time stamps are a bit more straightforward in their meaning. Last written is the last time the file was saved for any reason, not necessarily when it last changed, but just the last time the “save” button was hit or an auto save feature was engaged. The last access date/time is simply the last time the file was touched for some reason. There is no information here about what, who, why or what software tool took the action. The two most common reasons would be that the file was either opened or copied.

How all this metadata is interpreted is critical and often requires some explaining. First we have the counterintuitive situation where a file has a last written date/time of yesterday and a file creation date/time of today. Doesn’t a file have to be created before it was last saved? This is a common scenario that happens when a file is moved from one computer to another. If you created a file on computer A yesterday and then today copy that file from computer A to computer B (assuming you make no changes to the content), the last written date/time (in most, but not all cases) will carry over to the new computer but the file creation date/time will not. By copying the file from A to B, you have created a new file on B and it will have a file creation date/time that reflects when that action took place. In most cases the act of this file copy would also cause the last accessed date/time on computer A to be updated. Further, if we examine the time stamps on the two computers we can figure out exactly when the file (a copy of it, actually) left A and when it landed on B. A gap in the timing could even indicate that an intermediate storage device may exist (which means yet another copy of the file is floating around out there somewhere). If the last access and file creation date/time on computer B match, that is a pretty good indication that nothing has been done with that file since it was first copied to computer B.
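
The copy scenario above can be imitated on a single machine. This is a minimal sketch, not a forensic test: the two file names stand in for computers A and B, the source file is back-dated by one day with os.utime, and shutil.copy2 is used because it attempts to carry the last written time across (plain shutil.copy does not, which is one of the "not all cases"). The field st_ctime is only a rough stand-in for creation time here: it is the creation time on Windows but the inode change time on Unix-like systems.

```python
import os
import shutil
import time


def simulate_copy(workdir):
    """Create a file 'written yesterday', copy it, and report both sets of stamps.

    Hypothetical helper; file names are placeholders for computers A and B.
    """
    a = os.path.join(workdir, "computer_a.txt")
    b = os.path.join(workdir, "computer_b.txt")

    with open(a, "w") as fh:
        fh.write("draft contract")

    # Back-date last access and last written on A to stand in for "yesterday".
    yesterday = time.time() - 86400
    os.utime(a, (yesterday, yesterday))

    # copy2 copies the content and then tries to preserve the time stamps.
    shutil.copy2(a, b)

    st_a, st_b = os.stat(a), os.stat(b)
    return {
        "a_last_written": st_a.st_mtime,
        "b_last_written": st_b.st_mtime,
        # Stand-in for "created on B": see the caveat about st_ctime above.
        "b_created_or_changed": st_b.st_ctime,
    }
```

On the copy, the last written value matches the original from "yesterday", while the created-or-changed value reflects the moment of the copy, which is the seemingly backwards pattern described above.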

Just from this metadata it is possible to put together a great deal of valuable information about what a user may have been up to with their computer usage.

One of the telltale signs often seen in trade secret theft cases is the mass copying of files from a company computer to some other device, most often a thumb drive these days. Not too long ago I examined a computer used by an employee suspected of trade secret theft. On the employee’s last day of work he came in early (6 am). The last access metadata on the computer showed that at 6:10 am all of the files in the directory structure containing all of the company’s business and client records were accessed within a 4 minute time frame. The number and size of the files were such that the time intervals were consistent with the amount of time that would have been necessary to copy those files to a thumb drive. This is not proof of a copy process but it is a very strong indicator of a mass copy process. In this case other evidence locked down the full picture. An analysis of the computer’s operating system found log data showing that a thumb drive was attached to the machine at 6:05 am, and we discovered later that the employee had used his company Amex credit card the day before to purchase a thumb drive from the local office supply store.
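
One way to look for the kind of pattern described above is to search a list of last access stamps for the largest group that falls inside a short time window. The sketch below is an illustration under stated assumptions, not the method used in the case: the function name, the input format (pairs of path and last access datetime) and the default 4 minute window are all chosen for this example.

```python
from datetime import timedelta


def largest_access_burst(records, window=timedelta(minutes=4)):
    """Return the largest run of files whose last-access stamps fit in one window.

    records: iterable of (path, last_access_datetime) pairs (hypothetical format).
    """
    ordered = sorted(records, key=lambda r: r[1])
    best = []
    start = 0
    for end in range(len(ordered)):
        # Slide the start forward until the span fits inside the window.
        while ordered[end][1] - ordered[start][1] > window:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best
```

A large burst only shows that many files were touched close together. As noted above, it is an indicator to be weighed with other evidence, not proof of a copy.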

Not all cases are this clear-cut. There are a number of things that can muddy the waters. Under certain conditions, some actions can take place without the associated metadata date/time stamp being updated. In some cases, certain metadata simply does not exist. For example, on a typical floppy disk (yes, there are still plenty of them around), the last access date stamp is just the date. The data structure does not record a time.

Another wealth of information can be found in internal metadata. This is information stored within the file itself. The most common files here are things like Word documents and Excel spreadsheets. Then there is the whole world of metadata found in image/photo files like jpegs. This metadata will vary depending on the file type. Some of this may be pretty well understood by many users: things like the name of the document author or the creator’s company name. This is the sort of thing you can see if you look at the properties of the document from within the application. But there is usually a lot more information that is not displayed with the properties lookup. A forensic examination might also show things like the network storage location the file was saved to when it was created or what make and model of printer was set up on the computer that created the document. There is even a “feature” (since removed much to my investigative disappointment) on some older versions of certain office document creation applications that would embed a unique value into a document that would indicate the actual, specific, individual computer used to create the document. You can’t use that information to track down the machine in any sort of remote way but it can be used for comparison and/or elimination purposes against other machines and documents that have been examined.
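
For a sense of where some internal metadata lives, modern Word and Excel files (.docx, .xlsx) are zip archives, and a part named docProps/core.xml holds properties such as the author and the last person to save the file. The sketch below reads a few of those fields with the Python standard library. The function name is made up for this example, it does not apply to the older binary .doc and .xls formats, and it covers only a small part of what a forensic examination would look at.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx_file):
    """Return selected core properties from a .docx/.xlsx path or file object.

    Hypothetical helper; missing fields are returned as None.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))

    def text(tag):
        node = root.find(tag, NS)
        return node.text if node is not None else None

    return {
        "creator": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
        "created": text("dcterms:created"),
        "modified": text("dcterms:modified"),
        "revision": text("cp:revision"),
    }
```

Note that the created and modified values stored inside the document are separate from the file system stamps discussed earlier, and the two sets can disagree.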

I used this “feature” to great success to prove the facts of a case involving records tampering and perjury of several government officials when it came to their claims about where and when and on which computers certain documents were claimed to have been created.

Document metadata can also be applied to very large quantities of data. Using a combination of existing tools and software I wrote, I have been able to capture document metadata from large corporate network storage systems when searching for evidence in trade secret theft contamination cases. In these situations an employee inappropriately obtained material from a competitor, made some alterations to the files, and then distributed the “new content” within their company for their use. What didn’t get changed was some of the internal metadata that was still traceable directly back to the victim company. I was able to determine the true extent of the “contamination” and could then use that information to not only show the extent of damages but also what needed to be done to clean up the mess.

Lastly (at least for this brief explanation) is the topic of metadata found in photos. Digital cameras of all sizes, down to the phone in your pocket, embed a ton of information into the saved picture file. This includes everything from the exact time the picture was snapped to camera settings, GPS coordinates of the device, the make and model of the camera and even in some cases the serial number of the camera. The exact data does vary from device to device but you get the idea.
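
As a small worked example of photo metadata, GPS positions in a picture file are typically recorded as degrees, minutes and seconds plus a hemisphere letter (N, S, E or W), not as the single decimal number most mapping tools expect. The sketch below does only that conversion. It assumes the three values and the letter have already been pulled out of the file by some other tool, and the function name is made up for this example.

```python
def gps_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere letter to decimal degrees.

    Hypothetical helper; south and west are returned as negative values.
    """
    degrees, minutes, seconds = (float(v) for v in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if ref in ("S", "W") else decimal
```

For example, gps_to_decimal((37, 30, 0), "N") gives 37.5 and gps_to_decimal((122, 15, 0), "W") gives -122.25.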

The sky is the limit when it comes to uses of this information. It all depends on the circumstances of the investigation and the imagination and resourcefulness of those conducting it. The other key point is that all of this data is, to varying degrees, alterable and/or perishable. The source of the data and how it is collected and handled is just as important as its proper interpretation to determine its use, authenticity or to find evidence of tampering.
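
To make the point about alterable data concrete, the sketch below changes a file's last access and last written stamps with one standard library call while leaving the content untouched. The function names are made up for this example, and the SHA-256 hash is included only as an assumption about how one might confirm that the content itself did not change. It says nothing about whether the time stamps can be trusted.

```python
import hashlib
import os


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's content."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backdate(path, seconds):
    """Shift last access and last written stamps earlier by the given seconds.

    Hypothetical helper showing how easily these stamps can be altered.
    """
    st = os.stat(path)
    os.utime(path, (st.st_atime - seconds, st.st_mtime - seconds))
```

After backdate(path, 3600) the last written time is an hour earlier, yet sha256_of(path) returns the same value as before. The content hash and the time stamps are independent pieces of evidence, which is why the source of the data and how it was collected matter so much.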


9 Comments
Alan Hill
2/15/2021 02:45:07 pm

Interesting article Jon. It is a shame that the 'feature' was removed.
I was trying to find information on Creation dates for a Word document and how they could be modified. I consider it a simple task to modify the date by changing the system clock & doing a 'save as', or with an app to modify the file attributes. Would you agree?
I am having difficulty with my forensic expert in my law suite, to get him to admit that a creation date could be modified without trace. Regards, Alan

Jon Berryhill
2/15/2021 03:56:23 pm

Thanks for the comments. If you have problems getting your "expert" to agree with something so basic.... maybe you need a different expert. Give me a call if you have questions about a specific case.

Finley Hunter
3/8/2021 03:50:18 am

Really like these new tips, which I haven't heard of before, like the Computer Forensics Expert. Can’t wait to implement some of these as soon as possible.

Tez
4/28/2021 06:18:15 pm

I like your metadata tips. I need to make my facility more secure. I'll have to get a consultant to know where to beef up security.

Diya
3/20/2022 06:21:24 pm

The article was easy to understand thank you!!

Jennie Bethel
5/17/2022 11:51:47 pm

Do you know how much it cost to find meta data in a 8 page document for patient medical. And believe the place falsified my medical records and was seeing how much getting a copy of the metadata could cost.

Wondering
6/8/2022 03:31:23 am

Hi There,

Can metadata and hash values be overwritten and unrecoverable like images of documents and photos, or is everything pretty much recoverable if you have the right expertise and equipment?

Also, on the properties menu there are options such as "encrypt contents to secure data" and "remove properties." Regardless of these options, I'm assuming that experts can still obtain the data, like metadata, so what's the point of having options like that? Is it just to provide users a false sense of security that their information is being protected? Not that I care if criminals have a false sense of security, but I find it interesting. Thank you

Jon Berryhill
6/8/2022 09:52:41 am

Keep in mind what "Metadata" means. It is data about data or "beyond the data". Different file types have different types of metadata. Some of it is stored within the file in question. Other data may be external to it.

Maryland Facesitting link
11/14/2022 01:48:59 pm

Hello mate great blogg

Berryhill Computer Forensics, Inc.   TX 6-853-249  All Rights Reserved.
Text and content on this site may not be used without written permission.
Copyright © 1997-2023