Writer: Vj

Series 1 Part 5 - Big Data: The V's - Volume, Velocity, Variety...

Volume, Velocity, Variety, Veracity, and Value are considered the main characteristics of Big Data.


Note: If you did not get to read the previous post in this series, click the link: Series 1 Part 4 https://www.abigdatablog.com/post/big-data-data-sources-and-types



An introduction to the V's in Big Data:
The V's are the characteristics of Big Data. As data keeps evolving with time, the industry has identified multiple V's; the following are some of them.

  • Volume: The volume of the data is enormous.

  • Velocity: Data is collected and accumulated at a very high speed.

  • Variety: The data is collected from multiple sources and arrives in different formats.

  • Veracity: There are inconsistencies and uncertainty due to the way the data is collected.

  • Value: Identifying the value of the data in the massive pile of Big data is challenging.

  • Validity: The correctness of the data and the validity of its sources.

  • Variability: The data keeps changing with time; information collected is dynamic by nature.

  • Volatility: The data tends to change rapidly over time.

  • Vulnerability: The data is vulnerable to breach or attacks.

  • Visualization: Collecting data is one thing; visualizing it in a way that yields meaningful insight is another challenge.


The main #V’s are as follows:


Volume refers to the amount of data and its growth. The size of data generated by humans, machines, and their interactions on social sites is massive. Researchers predicted that the world's data would grow to around 40 zettabytes (40,000 exabytes) by 2020.


Velocity is defined as the pace at which different sources generate data every day. The flow of data is massive and continuous. Facebook alone has 1.03 billion Daily Active Users. If you can handle the velocity, you will be able to generate insights and make decisions based on real-time data.
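As a rough illustration of handling velocity, the sketch below counts events per second in a simulated high-speed stream. The stream, timestamps, and function name are all invented for this example; real systems would use a streaming engine rather than an in-memory loop.

```python
from collections import Counter
from datetime import datetime, timedelta

def count_per_second(events):
    """Count events per one-second window.
    events: iterable of (timestamp, payload) tuples."""
    counts = Counter()
    for ts, _payload in events:
        # Truncate to whole seconds so events in the same second share a key.
        counts[ts.replace(microsecond=0)] += 1
    return counts

# Simulated stream: 5 events arriving 200 ms apart, all within one second.
base = datetime(2023, 1, 1, 12, 0, 0)
stream = [(base + timedelta(milliseconds=200 * i), f"event-{i}") for i in range(5)]
per_second = count_per_second(stream)
print(max(per_second.values()))  # peak events observed in a single second
```

In a real pipeline, the same per-window counting idea shows up in tools like Kafka Streams or Spark Structured Streaming, just applied to unbounded input.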


Variety: as many sources contribute to Big Data, the types of data they generate differ. Data can be structured, semi-structured, or unstructured. Hence, a variety of data is getting collected every day. Earlier, we used to get data from Excel sheets and databases; now, data arrives in the form of images, audio, video, sensor readings, etc. This variety of unstructured data creates problems in capturing, storing, mining, and analyzing the data.


Veracity refers to uncertainty, inconsistency, and incompleteness in the data. Available data can get messy and may be challenging to trust. With many forms of big data, quality and accuracy are difficult to control, like Twitter posts with hashtags, abbreviations, typos, and everyday speech. The sheer volume is often the reason behind the lack of quality and accuracy in the data.
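A minimal sketch of taming that messiness: normalize a tweet-like string by lowercasing, stripping hashtag markers, and expanding a couple of abbreviations. The function name and the tiny abbreviation table are invented for illustration; real cleaning pipelines use much larger dictionaries and language-aware tooling.

```python
import re

def clean_text(text, abbreviations=None):
    """Normalize a messy social-media post: lowercase, strip '#' from
    hashtags, and expand a few known abbreviations.
    The abbreviation table here is a made-up example, not a standard list."""
    abbreviations = abbreviations or {"u": "you", "gr8": "great"}
    text = text.lower()
    text = re.sub(r"#(\w+)", r"\1", text)  # "#bigdata" -> "bigdata"
    words = [abbreviations.get(w, w) for w in text.split()]
    return " ".join(words)

print(clean_text("U will find #BigData gr8"))
```

Even this toy version shows why veracity is hard: every normalization step is a judgment call, and a wrong expansion can change the meaning of the original post.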


One survey identified that 27% of respondents were unsure how much of their data was inaccurate. Poor data quality is estimated to cost the US economy around $3.1 trillion a year.


Having access to big data is an excellent opportunity to tap into future business opportunities, but unless we can turn data into profit, it is useless. How much value does the analyzed data actually add to the organizations working with big data? Is there any #ROI (Return On Investment)?


Conclusion:

Big data, backed by a reliable system and architecture, can provide real value to a company. However, vulnerability is high due to the nature and characteristics of data collected in such massive quantities.


