ODPi Meetup Recap: “War Stories of Making Software Work with Hadoop”

August 2, 2016

Hadoop Summit is notorious for bringing together everyone who’s anyone in the Big Data world – and this year’s event, welcoming more than 4,000 attendees, was no different.

Not only was ODPi able to announce that five Apache™ Hadoop® distributions are officially ODPi Runtime Compliant, but we also hosted a meetup that centered on “War Stories of Making Software Work with Hadoop.”

 

Successfully migrating big data software to interoperate with one or more Apache™ Hadoop® releases requires unique engineering approaches and streamlined innovation. Our meetup discussed the importance and benefits of certifying compatibility across multiple Hadoop distributions. Those who have navigated this space for years without any true standardization shared their war stories.

Attendees also heard from ODPi members hailing from big data software vendors and ISVs. The War Stories panel featured insights from Scott Gray, chief architect of IBM’s Open Platform for Apache Hadoop; Vineet Goel, principal product manager of Pivotal HDB & Hadoop at Pivotal; Paul Kent, VP of big data initiatives at SAS; and Smiti Sharma, principal engineer of big data and emerging technologies for EMC. These members have each ported their software to work with one or more Hadoop distributions.

They discussed the technical challenges they overcame and why they believe ODPi will help simplify this porting work for both end users and ISVs in the future.

After explaining to the room how their companies are committed to big data innovation and how their technologies aid end users, Gray, Goel, Kent, and Sharma turned to cross-organizational compatibility within the Hadoop space.

John Mertic’s first question to the panel (found at the 28:50 mark) was, “Before the concept of what ODPi is meant to deliver, what were the chief challenges you were running into?”

In answering this question, the panelists drew mostly on their experience with the difficulties of supporting multiple, disjointed distributions, and they made some insightful statements.

Gray of IBM set the stage for these pain points, noting, “Hadoop evolves at an incredible pace and there’s this never-ending tension between what the customers want… and distros [being] pressed to keep up with this evolution, and we have all these products trying to chase the distribution… It makes it incredibly, insanely expensive… It really was in our best interest to try to put a little sanity into the landscape.”

 

Goel applauded ODPi’s baseline specifications and explained Pivotal’s arduous journey of taking on a new distribution (around the 34:00 mark). Mertic commented: “I like how you said, ‘If we had the money back from supporting all these distros, imagine the innovation we could have…’ I think that’s a really powerful statement.”

Once the interactive Q&A kicked off, an audience member asked for examples of the value proposition for end users of engaging with companies that are part of ODPi (starting after the 42:00 mark).

Sharma addressed this question, drawing on her experience in pre-sales: “You could benefit from being on an ODPi-compliant platform… if you want to have your application portable from a Hadoop as an OS, it’s possible through being part of ODPi.”

“In the early days of Hadoop, you really did have to grow your own in-house talent,” said Kent, “but we’re entering the mature part of the lifecycle curve where there’s lots of customers that just want to pick it up and use it. They don’t really want to get into all these nuances. So the value of something like ODPi… will inevitably make a standardized path, where people can say ‘If you don’t go out of these lines, you’re pretty safe.’”

Catch a full recording of our meetup, centered on how ODPi fits into the Hadoop and Big Data ecosystem, here – and don’t forget to subscribe to our YouTube channel!