What Is Culling?

Your client has dumped terabytes of data on you, and now you have to go through it all and determine what is important to the case. In the eDiscovery world we call this culling. There is nothing really technical about it: culling is simply the process of removing content that is irrelevant while searching to identify content that is relevant.

When someone uses the term "culling" I often have the image of a miner with a pick.  That's exactly what this is.  You are looking for the important stuff.  To do that, you must chip away the stuff you don't need to get to it.  Chances are that over 75% of the data you collect will never be produced.  Hundreds of hours are spent every year reviewing data.  

Culling helps you on the front end by narrowing the collection down to the data you truly need.

There are generally three types of culling:

  1. DeNISTing - Yes, this is a big-time techie term, but it is simply the method of removing all of the junk data, such as system files or other file formats which aren't generated by the user. The name comes from the National Institute of Standards and Technology (NIST), which publishes a reference list of known software files.
  2. Deduplication - Here's another tech term, but this method identifies and separates out duplicate documents and emails. This is done either globally (across the entire data collection) or by custodian. You don't want to review 10 copies of the exact same document. Each document has its own DNA (in practice, a hash value computed from its contents), and there are ways to carve out the duplicates based upon that DNA.
  3. Search terms - Once you are familiar with the case, you can create search terms to include (or exclude) data.

Culling takes time. That's always a buzzkill to tell people. You aren't going to complete culling quickly. Just like a miner using a pickax, you have to chip away at it. Do it in chunks. You have many ways to filter the data. For example, you can eliminate some data by filtering out file types or filtering by dates.
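
For the technically curious, here is a rough sketch of how the culling passes described above might fit together in code. It is only an illustration, not how any particular review platform works. The `Doc` record, the `cull` function and its parameters are all made up for this example, and it assumes a document's "DNA" is a SHA-256 hash of its raw contents. Real tools often use other hash algorithms and build email fingerprints from selected metadata fields.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical record for one collected file."""
    custodian: str
    name: str
    modified: date
    content: bytes


def fingerprint(doc):
    # The document's "DNA": a hash of its contents (SHA-256 is an assumption).
    return hashlib.sha256(doc.content).hexdigest()


def cull(docs, known_system_hashes, keep_extensions, start, end):
    """Return the documents that survive every culling pass."""
    seen = set()
    kept = []
    for doc in docs:
        digest = fingerprint(doc)

        # DeNISTing: drop files whose hash is on the known-system-file list.
        if digest in known_system_hashes:
            continue

        # File-type filter: keep only the extensions we care about.
        ext = doc.name.rsplit(".", 1)[-1].lower() if "." in doc.name else ""
        if ext not in keep_extensions:
            continue

        # Date filter: keep only files modified inside the relevant window.
        if not (start <= doc.modified <= end):
            continue

        # Global deduplication: keep the first copy of each fingerprint.
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(doc)
    return kept
```

Deduplication runs last here so that a copy thrown out by the date or file-type filter does not hide an identical copy that would have passed. To deduplicate by custodian instead of globally, you could track `(doc.custodian, digest)` pairs in `seen` rather than the digest alone.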

You also don't have to process everything you collect. I can't tell you the many times an attorney has asked me to just "scan everything in" first. Certainly there is always a concern that something could be missed, but you can add it in later if it turns out to be important, or conduct a random sample of the data that is left behind.
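
If you want to spot-check what was left behind, a random sample can be pulled with a few lines of code. This is a minimal sketch under assumptions of my own: the function name `sample_left_behind`, the document ID format and the fixed seed are all hypothetical, and the right sample size is a judgment call for your case team.

```python
import random


def sample_left_behind(doc_ids, sample_size, seed=0):
    """Pick a reproducible random sample of unprocessed document IDs."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    ids = list(doc_ids)
    k = min(sample_size, len(ids))  # never ask for more than exists
    return rng.sample(ids, k)


# Example: pull 5 documents from a hypothetical set of 1,000 left behind.
left_behind = [f"DOC-{n:04d}" for n in range(1000)]
to_review = sample_left_behind(left_behind, 5)
```

Using a fixed seed means you can document exactly which files were sampled and reproduce the list later if anyone asks how the check was done.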

The beginning of any culling is planning. Culling without a plan for how you will go through the data is not a good idea. Here are a few questions that will help you plan:

#1 - What kind of data do I have? How many custodians?
#2 - What types of data will be important to my case?
#3 - Can we divide up the data for culling?
#4 - Can we do the culling in-house or do we need to outsource it?

If you are unsure where to start or what to do, please contact a litigation support vendor for advice and assistance.  These people have the experience, software and resources to help you cull through the data. Even if you have excellent culling techniques in place, some data will need to be further evaluated.  

In the "old days" we culled banker's boxes of paper, and the only way to cull was manually, by hand. Today's technology is able to deal with the massive data dumps we receive. Culling is still a necessary evil, but if it is planned and executed properly, you will be successful in finding the "gold" in your mine.

