Big Data – Understanding the Risks

Introduction

This article aims to give a high-level overview of the risks businesses need to be aware of when using Big Data.

What is Big Data?

A popular definition of Big Data, found in the Gartner IT Glossary, is known as the three “Vs”: “high-volume, high-velocity and/or high-variety information assets”. Additional “Vs” have also been proposed, including veracity, meaning the uncertainty of data, and value, reflecting its commercial and economic potential.

Big Data is used to produce predictions by applying complex analytics to infer information from data sets drawn from a variety of different sources (“Big Data Analytics”). The data can be collected from various sources and sensors, including (without limitation): internet clicks; GPS data from satellite devices; wearable devices; swipe cards; payment devices; health information; and weather sensors.

What are the advantages of using Big Data?

Big Data has been used to improve transport, health and education services. For example, Transport for London uses a combination of data (inferred from sources such as ticket, camera, location and prediction information) to plan its closures and diversions so as to minimise disruption for travellers.

Private businesses already use Big Data to offer customers more tailored advice, offers and related products, with a view to obtaining and maintaining a market advantage. Benefits can also be passed on to consumers. For instance, where insurance providers can obtain the additional information they require through Big Data Analytics, consumers spend less time providing the information needed for insurance cover.

What are the risks?

The summary below provides a quick overview of some of the risks involved in using Big Data and the analytics often associated with it.

Legal compliance

Where the data contains personal data, a business needs to comply with data protection laws and the upcoming requirements of the General Data Protection Regulation. Personal data is any data which can identify an individual (e.g. name, location data, IP address). A failure to comply with data protection legislation could result in a serious data breach which can attract large fines and lead to reputational damage.

The regulatory burden and the risk of breaching data protection laws can be mitigated by considering whether all of the data is necessary, whether a time limit for retention could be imposed and whether the personal data can be anonymised. Where personal data is used, businesses need appropriate practices and policies in place.
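As a rough illustration of the mitigation steps above, the following Python sketch drops records that are past a retention period, removes direct identifiers and replaces a customer identifier with a salted hash. The field names, the one-year retention period and the function names are assumptions made for this example only. Note also that salted hashing is generally treated as pseudonymisation rather than full anonymisation, so the output may still count as personal data.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical field names and retention period, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "ip_address"}
RETENTION = timedelta(days=365)


def pseudonymise(record, salt):
    """Return a copy of the record without direct identifiers.

    The customer id is replaced by a salted SHA-256 digest, so records can
    still be linked to each other without exposing the original id.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256((salt + str(record["customer_id"])).encode("utf-8"))
    cleaned["customer_id"] = digest.hexdigest()
    return cleaned


def within_retention(record, today):
    """True if the record is still inside the retention period."""
    return today - record["collected_on"] <= RETENTION


def prepare_for_analysis(records, salt, today):
    """Drop expired records, then pseudonymise what remains."""
    return [pseudonymise(r, salt) for r in records if within_retention(r, today)]
```

A business would call prepare_for_analysis on its raw records before passing them to any analytics, keeping the salt secret and separate from the data. Whether this is sufficient in a given case is a legal question and not only a technical one.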

Discrimination

Big Data Analytics has been known to learn patterns from data that discriminate against people. For instance, a female doctor was locked out of a changing room because the automated security system had profiled her as male, having associated the title “Dr” with men. This is clearly an unwanted result and could have serious consequences, especially where automatic profiling is used in an employment context.
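To show how such a pattern can arise, here is a toy Python sketch in which a naive model predicts gender from a title by picking whichever gender appeared most often with that title in past records. The records are invented and deliberately skewed, and nothing here reflects how the real security system in the example worked.

```python
from collections import Counter, defaultdict

# Invented historical records (title, recorded gender), deliberately skewed.
history = [
    ("Dr", "male"), ("Dr", "male"), ("Dr", "male"), ("Dr", "female"),
    ("Mr", "male"), ("Mr", "male"),
    ("Ms", "female"), ("Ms", "female"),
]


def train_majority_model(rows):
    """Map each title to the gender seen most often with it."""
    counts = defaultdict(Counter)
    for title, gender in rows:
        counts[title][gender] += 1
    return {title: c.most_common(1)[0][0] for title, c in counts.items()}


model = train_majority_model(history)
print(model["Dr"])  # prints "male"
```

Because three of the four “Dr” records are male, the model labels every doctor as male, including the one woman already in its own data. The bias comes from the imbalance in the records, not from any explicit rule.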

Accuracy

Big Data Analytics is predictive in nature, which means that it sometimes draws inaccurate conclusions. If the information inputted is biased, the results are also likely to be biased. For example, the City of Boston provides a Street Bump app for smartphones. On a car journey, the app uses the phone’s accelerometer and GPS data to record movements caused by problems with the roads (e.g. potholes) and transmits the data to the council. Differing levels of smartphone ownership among socioeconomic groups mean that more data may be collected from more affluent areas, rather than from those with the worst roads.
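The effect can be shown with some simple arithmetic. The Python sketch below uses two hypothetical areas with made-up pothole counts and smartphone ownership rates (none of the figures come from Boston’s data) and assumes a pothole is only reported when the driver passing over it carries a smartphone with the app.

```python
# Hypothetical figures for two areas; none of these come from real data.
areas = {
    "affluent": {"potholes": 40, "smartphone_share": 0.8},
    "less_affluent": {"potholes": 70, "smartphone_share": 0.3},
}


def expected_reports(area):
    """Expected number of potholes reported, assuming a pothole is reported
    only when the passing driver carries a smartphone with the app."""
    return area["potholes"] * area["smartphone_share"]


# Which area looks worst according to the app, and which actually is worst?
worst_by_reports = max(areas, key=lambda name: expected_reports(areas[name]))
worst_by_potholes = max(areas, key=lambda name: areas[name]["potholes"])

for name, area in areas.items():
    print(name, "reports:", round(expected_reports(area)),
          "actual potholes:", area["potholes"])
print("Worst by reports:", worst_by_reports)
print("Worst by actual potholes:", worst_by_potholes)
```

Under these assumptions the affluent area generates about 32 reports against about 21 for the less affluent area, even though the less affluent area has nearly twice as many potholes. A council relying on the raw report counts alone would send its repair crews to the wrong place.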

Inaccurate data may even be provided deliberately. A study found that 60% of UK consumers have intentionally submitted inaccurate information when providing their personal details online in an attempt to keep their details private.

Intellectual Property

Third parties may have rights over the databases, software or algorithms used to analyse data sets. Businesses must make sure they have adequate licences so as not to infringe another party’s intellectual property rights.

Cyber security

If data is valuable to one business, it is likely to be valuable to others. Businesses must make sure they have adequate security measures in place to protect the data they own and collect. Where personal data is involved, safeguards may need to be higher.

Competition law

As data is a valuable commodity, competition authorities may prioritise investigating how companies use Big Data and the associated analytics. For example, the German competition authority found that Facebook abused a dominant market position through its use of advertising targeted at consumers.

Other Risks to Consider

The list above is not exhaustive, and there may also be sector-specific risks of which businesses need to be aware.

Conclusion

The use of Big Data can provide advantageous results for businesses, research and the public sector alike. However, due consideration must be given to the risks associated with the retention and use of data, especially where personal data is involved.

 

These notes have been prepared for the purpose of articles only. They should not be regarded as a substitute for taking legal advice.