What is Dark Data?
With the rise of the idea that ‘information is power’, organizations began accumulating and storing enormous amounts of data. Just as we rarely clear out a closet until we run short of space, organizations accumulate data every day and keep it on a ‘might be required someday’ basis. Once the data has served its initial purpose, no one knows what to do with it. Data that is retained beyond its initial use without any specific purpose is termed ‘dark data’.
Gartner defined dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes”. Dark data is therefore a broad and inclusive category. Any information that has become obsolete and has no immediate use, but is retained as ‘past records’, can be termed dark data.
Why Dark Data Matters
The ever-growing volume of data creates storage problems. Obsolete data has little tactical value, yet it is costly to maintain. It can also slow down servers, which makes performance a challenge as well.
Might Be Useful OR Might Be a Potential Threat
However, one of the biggest challenges with any kind of data is security. Data security is a massive responsibility in itself, and the challenge becomes even bigger when the data is unsorted or obsolete for the organization.
More often than not, such data contains proprietary information, client details, past employee records, and other sensitive information. No one in the organization is interested in investing time or energy in so-called obsolete data, yet that information is a gold mine for hackers. With cyber-attacks on the rise, piling up information that carries security risks is not a smart move. Dark data must be managed to save costs and to avoid the embarrassment of a data breach.
Big and Dark Data Statistics:
- The business data of organizations worldwide doubles every fifteen months.
- Poor and unorganized data costs businesses up to 35 percent of their operating costs.
- Poor and unorganized data costs US businesses alone $600 billion per year.
- According to executives, the inflow of data puts additional strain on existing IT infrastructure.
- 55 percent of IT managers report that the additional accumulated data slows down existing IT infrastructure.
- 47 percent of IT managers report data security issues due to the accumulation of additional, unorganized data.
Should You Be Concerned About Dark Data Accumulation?
Usually, the belief that ‘information is power’ keeps organizations from parting with their dark data, even though they have no plans to use it. The staunch belief that ‘someday it might be lucrative’ and deliver a competitive advantage is what keeps organizations hooked on massive data buildup.
To store data that keeps accumulating, organizations have only a few economical options, and large-scale cloud storage seems quite advantageous.
Organizations that use cloud services to park such data, and organizations that accumulate and store such data on their own websites, should be concerned.
Dark Data and Information Sensitivity
The concern should be grave if the data parked on a third-party cloud server contains any sensitive information related to:
- Present or potential customers’ information, patient records – financial or personal information,
- Financial transactions, bills with bank details and contact information,
- Employment records of past and present employees,
- Sensitive business information such as business practices, important partnerships or amalgamations, ex-joint venture details and competitive advantages, etc.
Management has very little control over the safety of third-party cloud servers. The risk of data theft is high when the data is parked on the internet, so it is advisable to use a reputable cloud service provider with good reviews for its service.
Apart from the loss of data, the organization may face legal and regulatory challenges. If intellectual property is stolen, it can result in financial losses. By its very nature, dark data is complex and not easily managed, so its potential value is difficult to calculate precisely.
Minimize the Security Risks of Dark Data
Data collection: Put a data collection system in place. Categorize documents by importance and data sensitivity, then store them accordingly. It is necessary to understand where the dark data is stored, how exposed it is, and how well it is secured.
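The categorization step can be sketched in a few lines of Python. This is only an illustration: the sensitivity tiers, the regular expressions, and the storage location names below are all hypothetical, and a real system would need far more careful detection rules.

```python
import re
from dataclasses import dataclass

# Hypothetical detection patterns per sensitivity tier (illustrative only).
PATTERNS = {
    "high": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # SSN-like number
        re.compile(r"\b(?:\d[ -]?){13,16}\b"),   # card-like run of digits
    ],
    "medium": [
        re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email address
    ],
}

# Hypothetical mapping from sensitivity tier to a storage location.
STORAGE = {
    "high": "encrypted-vault",
    "medium": "restricted-share",
    "low": "general-archive",
}


@dataclass
class Classified:
    name: str
    sensitivity: str
    location: str


def classify(name: str, text: str) -> Classified:
    """Assign the highest sensitivity tier whose patterns match the text."""
    for tier in ("high", "medium"):
        if any(pattern.search(text) for pattern in PATTERNS[tier]):
            return Classified(name, tier, STORAGE[tier])
    return Classified(name, "low", STORAGE["low"])


if __name__ == "__main__":
    for doc in [("hr.txt", "SSN 123-45-6789"), ("memo.txt", "Lunch menu")]:
        print(classify(*doc))
```

Checking the most sensitive tier first means a document that matches several tiers always lands in the strictest storage location.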
Design a retention policy: With the passage of time, retention of the data becomes not just costly but risky as well. Design a document retention and disposal policy in good faith. Construct the policy as per legal guidelines.
Defensible disposition is legal, so you are not obliged to store obsolete data indefinitely. The safe-harbor provision of FRCP 37(e) has been cited as protecting an organization that fails to provide electronically stored information because that information had already been deleted under a routine, good-faith disposal policy.
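A retention and disposal policy of this kind can be expressed as a simple rule check. The sketch below is an assumption-laden illustration: the categories, the retention periods, and the `legal_hold` flag are hypothetical and are not taken from any regulation; actual periods must come from the legal guidelines that apply to you.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days; real values depend on legal advice.
RETENTION_DAYS = {
    "financial": 7 * 365,
    "employment": 5 * 365,
    "general": 2 * 365,
}


@dataclass
class Record:
    name: str
    category: str
    last_used: date
    legal_hold: bool = False  # hypothetical flag: data needed for a pending matter


def disposal_decision(record: Record, today: date) -> str:
    """Return a retention decision for one record under the policy above."""
    if record.legal_hold:
        return "retain (legal hold)"
    days = RETENTION_DAYS.get(record.category, RETENTION_DAYS["general"])
    if today - record.last_used > timedelta(days=days):
        return "eligible for disposal"
    return "retain"


if __name__ == "__main__":
    old = Record("q3-notes.doc", "general", date(2015, 1, 1))
    print(old.name, "->", disposal_decision(old, date(2020, 1, 1)))
```

Keeping the policy in one table makes it easy to review, and applying it routinely rather than case by case is what makes a disposal decision easier to defend.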
Data encryption and protection: Whether you store data on your own drives, on servers, or on third-party clouds, ensure that it is amply protected. Require authentication for access to privileged or archived information. Data encryption, valid encryption certificates, and mandatory strong passwords are a few measures that every organization and individual should strictly abide by.
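As a small illustration of the strong-password and authentication points, the following Python sketch uses only the standard library. The password rules (`MIN_LENGTH` and the character classes) and the iteration count are assumptions chosen for the example, not a recommendation from any standard. It covers password handling only; encrypting the stored data itself would require a vetted cryptography library.

```python
import hashlib
import hmac
import os
import re

MIN_LENGTH = 12        # hypothetical policy value
ITERATIONS = 200_000   # illustrative work factor for PBKDF2


def is_strong(password: str) -> bool:
    """Check a password against a simple, hypothetical strength policy."""
    return bool(
        len(password) >= MIN_LENGTH
        and re.search(r"[a-z]", password)
        and re.search(r"[A-Z]", password)
        and re.search(r"\d", password)
        and re.search(r"[^A-Za-z0-9]", password)
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Derive a salted hash so the plain password is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, digest = hash_password("Correct-Horse-7-Battery")
    print(verify_password("Correct-Horse-7-Battery", salt, digest))
```

Storing only the salt and the derived hash means that a leaked archive does not directly reveal the passwords that guard it.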
Educate your staff: Educate your staff on safe retention and disposal of data.
Ongoing inventory assessment: Do not invest too much in finding disposables. If you have already harvested a lot of data that needs to be categorized and managed, take a lesson from DuPont. DuPont incurred a cost of $12 million over a period of 3 years to find out that 50 percent of its legal documents were disposable.
Employ the latest tools and technology to extract the data that carries important information.
Safe disposal: Data disposal is as tricky as data retention. Set rules for disposing of data and allow only authorized personnel to delete it. Ensure that the data is disposed of thoroughly and carefully. Human error, such as discarding information in wastebaskets, is one of the key factors that lead to security breaches. When in doubt about whether data should be retained or disposed of, conduct a sampling test.
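One way to run such a sampling test is to review a random subset of the archive and estimate what share of it is disposable before committing to a full review. The Python sketch below assumes a hypothetical `is_disposable` check, which in practice would be a human reviewer's judgment rather than a function.

```python
import random
from typing import Callable, Sequence


def estimate_disposable_fraction(
    items: Sequence,
    is_disposable: Callable[[object], bool],
    sample_size: int,
    seed: int = 0,
) -> float:
    """Estimate the disposable share of an archive from a random sample."""
    if not items:
        raise ValueError("archive is empty")
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(list(items), min(sample_size, len(items)))
    hits = sum(1 for item in sample if is_disposable(item))
    return hits / len(sample)


if __name__ == "__main__":
    # Hypothetical archive: each entry records whether it is past retention.
    archive = [{"id": i, "expired": i % 2 == 0} for i in range(1000)]
    share = estimate_disposable_fraction(archive, lambda d: d["expired"], 100)
    print(f"Estimated disposable share: {share:.0%}")
```

A small sample gives only a rough estimate, but it is usually enough to decide whether a full review is worth its cost.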
If data is a valuable asset to you and your organization, do not treat it like garbage. Think before employing flimsy security measures, and consider carefully before dumping data onto a crummy storage service. The security of your data, including dark data, is your responsibility, and a good part of your reputation may depend on it. Stay safe!