Big Data. A buzzword? A trend? A fad? It’s been around for a while now, and if I had a dollar for every time I heard or read the phrase, I’d be happily cradling a beer on my deck, suspended over the Croatian cliff line, while my kids played with an Oculus a safe distance away from my Tesla.
The topic is a big deal and clearly has many implications for those who hold, use and manipulate big data. What’s changed is that the data itself is no longer the focus – Big Data is now the norm. Data sets are not going to get smaller any time soon and, for what it’s worth, even Gartner has stated that ‘big data is no longer a topic unto itself’.
Now the focus is squarely on how to manage and derive value from the data – how to properly apply analytics to huge data sets to extract golden nuggets of information. As a result, the tools used to do this are now where it’s at.
And naturally, this tool set is enormous and continues to grow each year, with Hadoop not necessarily the first name that comes to mind.
When choosing tool sets, it’s important to consider various elements. For example: is real-time processing required? What are your availability requirements? How will users access the data? What data model are you using? What do you ultimately want from your data? Are there existing data warehouses to integrate into your solution?
Once these elements are established, it’s important to remember that more choice is not necessarily better: you may need to weigh trade-offs between performance, usability and algorithm execution.
The infographic below is a brilliant overview of the Big Data Tool landscape.
How many can you see that you’ve used? How many have you never heard of before?