Monday, March 23, 2015

Big Data Metro Map

I have a small mentoring team that I meet once every two weeks. As part of the engagement, I asked them to come up with a big data metro map. I would like to thank Vineeth Roy, Ajesh Kumar, and Toney Thomas for coming up with version 1 of the big data metro map. A double thanks to Vineeth Roy for drawing the metro map using a free tool.

We initially collated all the required data and started classifying it. As usual, there was some confusion around classification because of differing viewpoints. We settled on the classification below as a starting point. Vineeth was then assigned to come up with the first version of the diagram. We had to redraw it with a couple of the available open source tools before settling on version 1. This is the third iteration of the diagram, which we are calling version 1. It is by no means perfect. It needs further refinement so that anyone new to big data can use it as a blueprint. It also depicts the complexity involved in mastering this space, given the vastness of the field. Testing is more complicated in this space due to the distributed nature of computation.

There are some fundamentals that everyone needs to know, such as HDFS, MapReduce, and Hive. You cannot focus on everything listed below, but you can choose a few streams to go deeper into, based on the role you play and the customer you work with. If you are trying to be an architect in this space, you need at least a shallow knowledge of everything listed below.

Please share your feedback so that we can improve it going forward. If you have good tools or better ideas for representing it, please redraw it and repost it for the benefit of others.

[Image: Big Data Metro Map, version 1]
