Why Big Data Needs Big Protection
Recent years have seen a growing trend towards collecting and manipulating very large datasets, known as Big Data. Organisations do this so that they can mine the data for key information relevant to their business.
As the move to cloud computing gathers momentum, the dangers of holding big data increase correspondingly, because there is greater potential for loss or theft of vital company data. This is particularly true of data held offsite by a cloud hosting service.
What is Big Data?
A good working definition of big data is a dataset, or combination of datasets, that is too large to be processed using conventional tools. That could be because of the sheer volume of data, or because the organisation lacks the processing power to process it in the time required to use the results.
Because of the volumes involved, it is unlikely that big data will be held on a consumer cloud platform like OneDrive or Dropbox. That said, what counts as big data is relative to the size of the organisation, and a small business may well be able to use the business versions of OneDrive or Dropbox.
What is common to both small and large organisations is that big data will contain vital and confidential company information, and often data that must be protected under local data privacy legislation. That brings with it the need to protect that data against loss or theft, particularly if it is stored in a public or private cloud computing environment.
Why Protect it?
The requirement for big protection lies in two principal areas: the use of big data in compliance with local regulations, and the more expected area of IT security policy and procedures.
Recent changes in compliance and data protection regulations, particularly the European Union's GDPR, which came into effect in May 2018, have put a new twist on the need to protect big data and manage its use. GDPR looks at both the source and the application of big data, and at how it is to be handled, among other things to minimise the possibility of identifying individuals.
Similar concerns arise in the US, where a retailer was widely reported to have used big data analytics to identify individual shoppers who were pregnant and to estimate their due dates. It then marketed its goods and services to these individuals. In some cases, the individuals did not wish details of their pregnancy to be known, resulting in embarrassment and a damaging family situation.
The problem here did not lie with the IT security processes, but with the use of data analytics by the marketing department, which had not considered the privacy implications of its actions.
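One common way to reduce the chance of identifying individuals in an analytics dataset is to replace direct identifiers with keyed hashes before the data reaches the analysts. The following is a minimal sketch of that idea, not a compliance recipe. The function names, the field name customer_id and the sample key are all hypothetical, and pseudonymised data can still count as personal data, so this reduces risk rather than removing it.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash of an identifier (hypothetical helper).

    The same identifier and key always give the same token, so records
    can still be joined, but the token does not reveal the identifier.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_records(records, id_field, secret_key):
    """Return copies of the records with id_field replaced by its token."""
    result = []
    for record in records:
        copy = dict(record)  # leave the caller's data untouched
        copy[id_field] = pseudonymise(str(record[id_field]), secret_key)
        result.append(copy)
    return result


if __name__ == "__main__":
    # Illustrative key only; a real key would be stored separately and securely.
    key = b"replace-with-a-secret-key"
    shoppers = [{"customer_id": "C001", "basket_total": 42.50}]
    print(pseudonymise_records(shoppers, "customer_id", key))
```

Keeping the key away from the analytics environment matters here: anyone holding both the key and a list of candidate identifiers could recompute the tokens and re-identify individuals.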
What to Do
The first step in big protection is therefore to raise awareness of local regulations: what data you can use, and how you can use it. As with other security and compliance issues, a programme that starts at induction into the organisation and is reinforced with regular updates is needed to make sure that everyone working with big data understands their responsibilities in respect of data management.
IT security again falls into two categories. The first is a Data Loss Prevention plan to ensure that data is backed up and can be recovered in the event of data loss. A data loss is not necessarily the result of a malicious attack; it could be caused by something as simple as an untested fix to a system.
The second category is to ensure that the data is, as far as possible, secured against loss or theft, whether through a malicious attack or through deliberate or inadvertent user error. Unfortunately, conventional IT security measures are not enough for the mega-sized datasets of big data: they are simply not scalable or flexible enough. IT needs to consider how an individual anti-malware application or, more likely, a combination of different ones can be used.
That might require investment in additional hardware and software.
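On the backup side of the first category, a backup is only useful if it can actually be restored, so it is worth checking each copy as it is made. The sketch below copies a file and confirms that the copy matches the original by comparing SHA-256 checksums. The function names and paths are hypothetical, and a real big data platform would apply the same check through its own backup tooling rather than file by file.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and raise if the copy does not match."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies the contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed verification")
    return target
```

Running the same checksum comparison after a test restore gives some confidence that the data can be recovered, not just that it was copied.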
Big Data is here to stay, and all indications are that it is never deleted and will only increase in size and in the range of its analytic applications. Big Data will need Big Protection.