A year after 14 of the biggest names in the Hadoop ecosystem joined forces to try to harmonize the vast array of disparate technologies that make up the platform, the first version of their project plan has been released. The ODPi Runtime Specification lays out what the group sees as the key ingredients a distribution should include to effectively address user demands.
John Mertic, senior program manager for the ODPi, explained in a phone call how the main goal of the project is to bring together representatives from many different parts of the Hadoop world — not just the distributions themselves, but the makers of toolsets, the ISVs, and the customer solution providers — and remove both the duplication of effort and the divisions that have arisen between the various parties deploying Hadoop.
Industry-wide effort to advance open standards for Apache Hadoop® attracts diverse representation from the Big Data ecosystem, bringing enterprise requirements and technical teams to bear
NEW YORK (O’Reilly’s Strata Conference) and BUDAPEST (Apache: Big Data Conference), September 28, 2015 – ODPi, a nonprofit organization accelerating the delivery of Big Data solutions by powering a well-defined platform called ODPi Core, today announced new members, technical milestones, its formal governance structure and that it will be hosted at The Linux Foundation as a Collaborative Project.
The explosion of data and the requirements to store and process information have resulted in a variety of Big Data solutions. ODPi brings industry leaders together to accelerate the adoption of Apache Hadoop® and related Big Data technologies and make it easier to rapidly develop applications. This will be done through integration and standardization of a common reference platform that enables users to realize business results more quickly. ODPi will integrate a variety of upstream Apache projects, working across the Apache ecosystem to create a downstream reference platform on top of which new Big Data solutions can be built.
Membership investments in this effort have nearly doubled since ODPi was announced in February. Members to date represent a diverse group of Big Data solution providers and end users such as Altiscale, Ampool, Capgemini, CenturyLink, DataTorrent, EMC, GE, Hortonworks, IBM, Infosys, Linaro, NEC, Pivotal, PLDT, SAS Institute Inc, Splunk, Squid Solutions, SyncSort, Telstra, Teradata, Toshiba, UNIFi, VMware, WANdisco, Xiilab, zData and Zettaset.
“ODPi is a useful downstream project for the community to work on a common reference platform and set of technologies around Hadoop,” said Jim Zemlin, executive director at The Linux Foundation. “We’ve seen this model work with open source technologies experiencing rapid growth and know it can increase adoption and open up opportunities for innovation on top of an already strong Hadoop community.”
Technical milestones include the release of an initial ODPi core specification and reference implementation that simplifies upstream and downstream qualification efforts, created by developers from across the Big Data landscape. To start, more than 35 maintainers from 25 companies are dedicated to this ongoing work. The planned ODPi Certification Program is also underway; its goal is to ensure consistency and compatibility across the Big Data ecosystem. To learn more about this technical progress, please visit https://github.com/odpi or the ODPi website at http://ODPi.org
ODPi uses an open governance model that is led by a community of developers who will form a Technical Steering Committee (TSC) based on expertise and value of contribution. All members will have an equal vote on ODPi Core decisions, regardless of investment level, ensuring equality among all participants and an industry-wide consolidation of enterprise requirements. ODPi will also elect a Board of Directors responsible for the financial, legal and promotional aspects of ODPi.
ODPi will be featured at Strata+Hadoop World New York, September 29-October 1, where it will demonstrate the vision of one application running on many ODPi member distributions. An ODPi panel will be featured at Apache: Big Data in Budapest, September 28-30, 2015.
For more information about the ODPi, please visit: http://ODPi.org
ODPi is a Linux Foundation Collaborative Project. Collaborative Projects are independently funded software projects that harness the power of collaborative development to fuel innovation across industries and ecosystems. By spreading the collaborative DNA of the largest collaborative software development project in history, The Linux Foundation provides the essential collaborative and organizational framework so project hosts can focus on innovation and results. For more information about Linux Foundation Collaborative Projects, please visit: http://collabprojects.linuxfoundation.org/
“The ODPi has made tremendous progress in a short amount of time. This rapid growth attests to the demand for standards in the rapidly evolving Hadoop ecosystem. Both customers and third party application providers can see greater value from Hadoop when they are confident that their developed solutions will run as broadly as possible,” said Raymie Stata, CEO of Altiscale, former CTO of Yahoo!, and core member of the ODPi Technical Working Group. “The Technical Working Group has been moving along speedily on Hadoop core, and we are already looking ahead to bring more projects into the standard specification.”
“With the widespread adoption and importance of the Hadoop ecosystem within the enterprise, the ODPi standard platform is a very important and timely effort, and Ampool is excited to be participating. With ODPi, it will be clear what standards and level of support are expected for platforms, configuration, security, and interoperability in Hadoop distributions. We are confident that the ODPi effort will provide a solid base for further growth of Hadoop as the foundation of next generation data infrastructure,” said Milind Bhandarkar, Founder & CEO, Ampool Inc.
“Our membership in ODPi demonstrates our commitment to spurring adoption and fostering support for the larger Hadoop ecosystem. As one of the earliest members of ODPi, we strongly believe in participating in a common framework to ensure that every enterprise has access to DataTorrent RTS, our unified batch and streaming platform, no matter which Hadoop distribution they use,” said Phu Hoang, cofounder and CEO, DataTorrent.
“EMC is keenly interested in helping our customers build standardized environments for big data workloads. These environments are good for customers and good for the industry,” said Kelly Kanellakis, Senior Director, Business Operations, EMC Corporation. “ODPi builds exactly that through its efforts to create interchangeable Hadoop environments by working with customers and vendors in a collaborative way.”
“GE Software is committed to advancing the Hadoop ecosystem to support the industrial requirements of managing, processing and extracting insights from big data at scale,” said Vince Campisi, CIO, GE Software. “The creation of a common platform certification and governance process under the ODPi is enabling us to more effectively deliver industrial-strength apps to our customers to tackle their big data challenges with confidence. With the help of ODPi we can achieve this at a low cost, while enabling our customers to also benefit from the productivity gains that the Industrial Internet has to offer. It is a win-win for everyone.”
“At Hortonworks, we believe innovation happens not in isolation but in collaboration. We aim to speed Hadoop adoption through ecosystem interoperability rooted in open source so enterprise customers can reap the benefits of increased choice with more big data applications and solutions. As a founding ODPi member, we are pleased to see its significant strides toward these goals, all under an open and transparent governance model,” said Shaun Connolly, vice president of corporate strategy, Hortonworks.
“The power and appeal of open source innovation for technologies such as Hadoop is undeniable, especially when it comes to the growing volumes of data generated by social media, mobile devices, and machine-to-machine sensors,” said Beth Smith, General Manager, Analytics Platform, IBM Analytics. “In a recent report, IDC estimated that only 30 percent of clients have adopted Hadoop. Adoption is being constrained by complexity and a lack of standardization. IBM is fully committed, working with this community, to help drive speed-to-innovation for consistency and standardization in the development of smart business apps and accelerate the use of analytics across every business in a fundamental way.”
“Through its platinum sponsorship of the ODPi, Infosys is working with industry leaders to promote and advance the state of Apache Hadoop® and other enterprise big data technologies. Infosys also wants to grow the adoption of big data technologies in the enterprise by making significant improvements in areas like development and deployment tools, performance, and security, and is contributing these enhancements back to the community. One of our first contributions to the ODPi is the ODPi reference deployer that our team has built,” said Navin Budhiraja, SVP, Head of Architecture and Technology at Infosys Limited. “Infosys Information Platform (IIP), our open source data analytics platform, supports the ODPi core, and the extensive use of open source in IIP reflects the commitment of Infosys to ODPi and the open source community. In addition, multiple other strategic initiatives at Infosys, such as our industry solutions in Banking, Aeronautics and Retail, the Infosys Automation Platform (IAP), and the Industrial Internet Consortium (IIC) testbeds for predictive maintenance, are powered by IIP.”
“ODPi accelerates the delivery of Big Data solutions by providing a well-defined platform called ODPi Core, enabling enterprises to build transformative, personalized applications with data at their core. Pivotal’s technology and software development expertise transforms good companies into great software companies. Our collaboration with ODPi will usher in a new era of open source Big Data solutions central to an enterprise customer’s digital transformation journey, by arming them with the ability to use data to foster meaningful engagement with their customers,” said Gavin Sherry, Vice President and CTO, Data, Pivotal.
“Teradata is committed to accelerating enterprise adoption of Hadoop. ODPi plays an important role by testing and certifying the Apache Hadoop core. ODPi makes implementations easier and enterprise class, as well as enabling more software tools to work with Hadoop. We have been impressed by the progress ODPi has made toward these goals in such a short amount of time,” said Justin Borgman, VP/GM, Teradata Center for Hadoop.
“UNIFi is excited to be a part of the ODPi. We believe establishing standards and best practices to this rapidly innovating / evolving ecosystem of technology components will be a great step to enabling enterprises to become more data driven. Focusing on delivering value to the business is the promise of this movement and this consortium is furthering that goal,” said Sean Keenan, cofounder and vice president of products at UNIFi.
“VMware products and services help our customers deliver a consistent environment for building, running and managing any application including big data workloads. ODPi’s work on common, open source infrastructure to advance and accelerate big data adoption aligns well with VMware’s goal,” said Mark Lohmeyer, Vice President Products, Cloud Platform BU, VMware. “VMware is looking forward to participating in this pan-industry effort and advancing Big Data technologies for everyone.”
“Xiilab provides services and software based on Big Data. Our goal is for our users to feel there is no limit to what their application can accomplish. We hope to bring Hadoop and open source to the hands of the community so that the innovation may broaden the achievements of many. Being the first member from South Korea, we wish to have a symbiotic relationship with the community that will help deliver an accessible service to users across the globe,” said Xiilab CEO and President, Woo Young Lee.
“As a Big Data solution provider, zData Inc. sees the inherent need for unification and collaborative supervision of this rapidly changing platform ecosystem. zData’s largest challenge in providing Hadoop Managed Services has been to successfully deliver a predictable level of service to our customers and partners while still keeping up with the latest community innovations from many disparate open source projects. The ODPi has the opportunity to become the unifying factor, pulling together resources across the entire eco-system, from vendors to customers, to focus on standardizing and unifying these Open Source technologies.”
“Security is still considered a major barrier to broader adoption of Hadoop in the enterprise. To address that problem, Zettaset is providing customers with a proven, commercial-grade, standards-compliant encryption solution which is performance-optimized for Big Data architectures encompassing Hadoop, NoSQL, and other databases while delivering the highest levels of data protection,” said CEO Jim Vogt of Zettaset. “Zettaset and its advanced big data security solutions are aligned with the ODPi for one simple reason. With a standardized Hadoop distribution built around ODPi, customers now have a much more consistent and predictable technology foundation supported by a wide choice of long-standing systems vendors as well as specialized application providers like Zettaset. Technology platform stability inherently reduces risk, giving customers greater confidence to deploy Hadoop as a mainstream solution in the enterprise.”
About The Linux Foundation The Linux Foundation is a nonprofit consortium dedicated to fostering the growth of Linux and collaborative software development. Founded in 2000, the organization sponsors the work of Linux creator Linus Torvalds and promotes, protects and advances the Linux operating system and collaborative software development by marshaling the resources of its members and the open source community. The Linux Foundation provides a neutral forum for collaboration and education by hosting Collaborative Projects, Linux conferences including LinuxCon, and generating original research and content that advances the understanding of Linux and collaborative software development. More information can be found at www.linuxfoundation.org.
The Linux Foundation and Linux Standard Base are trademarks of The Linux Foundation. Linux is a trademark of Linus Torvalds.
The Linux Foundation
Today, fifteen industry leaders in the big data space announced the intent to create a new industry initiative, identified as the Open Data Platform (“ODP”), to promote open source-based big data technologies and standards for enterprises building data-driven applications (opendataplatform.org). The initial group of member companies includes Platinum members GE, Hortonworks, IBM, Infosys, Pivotal, SAS, a large international telecommunications firm, and Gold members AltiScale, Capgemini, CenturyLink, EMC, Splunk, Verizon Enterprise Solutions, Teradata, and VMware. Born from the playbook Pivotal used just a year ago to leverage open source and open collaboration to accelerate Cloud Foundry into becoming the biggest open source success in recent years, the Open Data Platform promises to do the same for the Apache Hadoop® ecosystem and big data, and do it quickly.
Everything Starts With Open Source
Last year, Pivotal published its open source manifesto, detailing why open source is pivotal to the success of any technology. From recruiting top talent to accelerating adoption, feedback and innovation, open source has long since proven that no proprietary technology can compete with a viable open source alternative. However, while single technologies have thrived with open source, ecosystems naturally lag in development without an organizing force. By openly joining forces with the leading vendors, service providers and users of Apache Hadoop® to focus specifically on the needs of the enterprise, the Open Data Platform aims to reduce fragmentation and accelerate development and innovation across the Hadoop ecosystem.
Open Collaboration: A Rising Tide That Lifts All Boats
A thriving ecosystem is the key to the real viability of any technology. With many eyes on the prize, the technology becomes more stable, offers more capabilities and, importantly, supports greater interoperability across technologies, making it easier to adopt and use in a shorter amount of time. By creating a formal organization, the Open Data Platform will act as a forcing function to accelerate the maturation of an ecosystem around Big Data. Of course, the caliber of the members of the organization is also very important. The members have to have relevant expertise and investment in the area. They also should be looking at the challenges from a variety of angles, balancing the views of consumers of the technologies with those of providers. This is why, when we set out to recruit for the Cloud Foundry Foundation, we recruited a variety of tech-savvy companies, from software giants like IBM, EMC and SAP to service providers like Savvis, Rackspace and NTT and industry-leading consumers of PaaS like Monsanto, eBay, and BNY Mellon.

For the Open Data Platform, the first wave of members combines heavyweight brands across Hadoop software providers including EMC, Hortonworks, IBM, Pivotal, Teradata, Splunk and VMware; service providers like AltiScale, CenturyLink, and Verizon Enterprise Solutions; advanced ISVs like Capgemini, Infosys, and SAS; and, finally, leading Hadoop consumers like General Electric and another large international telco. This is just the first wave, and as an open foundation, we expect to expand the ranks quickly. Once working under the foundation's framework, each of these companies will pool resources and efforts in cooperation, eliminating redundancies and establishing a clear and agreed way for us all to work. Simply put, this creates operational efficiencies across an entire ecosystem.
More investment will flow into the standardized open source, and more innovation and interoperability will flow out of the vendors in the ecosystem, accelerating benefits for all.
First Goals for the Open Data Platform Initiative
Translating this into real tactics and benefits, look for significant progress on three milestones toward a successful ecosystem in the Open Data Platform’s first year:
- An industry standard and open data management core. Initially focused on Apache Hadoop®, the Open Data Platform will develop and promote a set of open, enterprise focused Hadoop® standards and technologies. This translates to immediate benefits that will increase stability, capabilities, and compatibility among Hadoop® distributions.
- Certifying a common reference core. The Open Data Platform will deliver a certified, packaged, and tested reference core, giving the industry a coveted “test once, use everywhere” solution. With the entire industry enabled to create big data offerings using this consistent reference implementation, software applications will be more likely to run on any distribution based on the Open Data Platform’s Hadoop® core, reducing risk and vendor lock-in while focusing vendor resources toward more innovation.
- More support and contributions for the Apache Software Foundation. The Open Data Platform is expected to be complementary and beneficial to the efforts and stewardship of the Apache Software Foundation (ASF), using existing ASF processes to contribute code, perform testing and integration, and provide infrastructure support, as well as to increase participation in events and collaboration with the developer community.
The Future Is Near
Today’s announcement is about an organization that will be created in the near future. However, progress is not waiting for the Open Data Platform to stand itself up. It assembles many partners who are already working together on big data initiatives. GE helped get Pivotal started specifically to tackle the modern challenges of combining big data and the Internet of Things (IoT), with projected savings stacking up to trillions of dollars over the next few years. Hortonworks and Pivotal announced today that they will be combining efforts to support Hadoop distributions and partner on data lake technologies. Real code contributions are also prepared, with Pivotal open sourcing our SQL-on-Hadoop engine, HAWQ, allowing it to run across any distribution of Hadoop based on the Open Data Platform Core. If the efforts around the Cloud Foundry Foundation are any indicator (announcing the Foundation’s intent to form last February, standing the Foundation up in November, and posting record-breaking first-year open source sales by January of this year), everyone should expect the Open Data Platform to herald big advances for big data sooner rather than later.

Related Reading:
- Press Release: Pivotal and Hortonworks Join Forces On Apache Hadoop
- Press Release: Technology Leaders Unite Around Open Data Platform To Increase Enterprise Adoption of Hadoop and Big Data
- Press Release: Pivotal Introduces First Open Source, Enterprise Grade Big Data Product Suite
- Press Release: Pivotal Announces Big Data Global Roadshow To Bring Agile Data Expertise Directly To Customers
- Blog: Why The Open Data Platform Initiative Is Such A Big Deal For Big Data
- Hortonworks’ Shaun Connolly on the Pivotal Blog: Pivotal and Hortonworks Join Forces on Apache Hadoop
- Pivotal’s Gavin Sherry on the Hortonworks Blog: Pivotal Aligns With Hortonworks To Give Big Data A Big Boost
Editor’s Note: Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
It was just a little over 100 days ago that 15 industry leaders in the Big Data space announced the formation of the Open Data Platform (ODP) initiative. We’d like to let you know what has been going on in that time, to bring you a preview of what you can expect in the next few months and let you know how you can become involved.
What is the Open Data Platform Initiative?

The Open Data Platform Initiative (ODP) is a shared industry effort focused on simplifying adoption, promoting the use, and advancing the state of Apache Hadoop® and Big Data technologies for the enterprise. It is a non-profit organization being created by people who helped to create Apache, Eclipse, Linux, OpenStack, OpenDaylight, the Open Networking Foundation, OSGi, WS-I (Web Services Interoperability), UDDI, OASIS, the Cloud Foundry Foundation and many others. The organization relies on the governance of the Apache Software Foundation community to innovate and deliver the Apache project technologies included in the ODP core, while using a ‘one member, one vote’ philosophy where every member decides what’s on the roadmap. Over the next few weeks, we will be posting a number of blogs to describe in more detail how the organization is governed and how everyone can participate.

What is the Core?

The ODP Core provides a common set of open source technologies that currently includes Apache Hadoop® (inclusive of HDFS, YARN, and MapReduce) and Apache® Ambari. Once the ODP members and processes are well established, the scope of the ODP Core will expand to include other open source projects.

Benefits of the ODP Core

The ODP core is a set of open source Hadoop technologies designed to provide a standardized core that big data solution providers and software and hardware developers can use to deliver compatible solutions rooted in open source that unlock customer choice. By delivering on a vision of “verify once, run anywhere”, everyone benefits:
- For Apache Hadoop® technology vendors, reduced R&D costs that come from a shared qualification effort
- For Big Data application solution providers, reduced R&D costs that come from more predictable and better qualified releases
- Improved interoperability within the platform and simplified integration with existing systems in support of a broad set of use cases
- Less friction and confusion for Enterprise customers and vendors
- Ability to redirect resources towards higher value efforts
100 Day Progress Report
In the 100 days since the announcement, we’ve made some great progress.

Four Platforms Shipping

At Hadoop Summit in Brussels in April, we announced the availability of four Hadoop platforms all based on a vision of a common ODP core: Infosys Information Platform, IBM Open Platform, Hortonworks Data Platform and Pivotal HD. The commercial delivery of ODP-based distributions across multiple industry-leading vendors immediately after the launch of the initiative demonstrates the momentum behind ODP to accelerate the delivery of compatible Hadoop distributions and the simplification an industry standard brings to the ecosystem.

New Members and New Participation Levels

In addition to revealing that Telstra is one of the founding Platinum members of the ODP, we’ve added nine new members, including BMC, DataTorrent, PLDT, Squid Solutions, Syncsort, Unifi, zData and Zettaset. We welcome these new members and are looking forward to their participation and their announcements. We also announced a new membership level to provide an easy entrée for any company to participate in the ODP. The Silver level of membership allows companies to have a direct voice in the future of big data and to contribute people, tests, and code to accelerate executing on the vision.

Community Collaboration at the Bug Bash

ODP member Altiscale led the effort on a Hadoop Community Bug Bash. This unique event for the Apache Hadoop community, along with co-sponsors Hortonworks, Huawei, Infosys, and Pivotal, brought together over 150 participants from eight countries and nine time zones to strengthen Hadoop and honor the work of the community by reviewing and resolving software patches. Read more about the Bug Bash, where 186 issues were resolved either with closure or with patches committed to code. Nice job, everyone! You can participate in upcoming bug bashes, so stay tuned.
Technical Working Group and the ASF

Senior engineers and architects from the ODP member companies have come together as a Technical Working Group (TWG). The goal of the TWG is to jump-start the work required to produce ODP core deliverables and to seed the technical community overseeing the future evolution of the ODP core. Delivering on the promise of “verify once and run anywhere”, the TWG is building certification guidelines for “compatibility” (for software running on top of ODP) and “compliance” (for ODP platforms). We have scheduled a second TWG face-to-face meeting at Hadoop Summit, where committers, PMC members and ASF members will meet to continue these discussions.
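To make the compliance side of that distinction concrete, here is a minimal, purely illustrative sketch: it checks whether a hypothetical distribution "manifest" lists every ODP core component (Hadoop's HDFS, YARN and MapReduce, plus Ambari). The manifest format and the `is_odp_compliant` function are assumptions for illustration only, not part of any actual ODP deliverable or test suite.

```python
# Hypothetical illustration: the ODP core component set, per the ODP Core
# description (Apache Hadoop's HDFS, YARN, MapReduce, and Apache Ambari).
ODP_CORE = {"hdfs", "yarn", "mapreduce", "ambari"}

def is_odp_compliant(manifest_components):
    """Return True if a distribution's component list covers the ODP core.

    `manifest_components` is an assumed, simplified stand-in for whatever
    metadata a real platform would expose about the components it ships.
    """
    missing = ODP_CORE - {c.lower() for c in manifest_components}
    return not missing

# A distribution shipping all four core components passes the toy check;
# one missing Ambari does not.
print(is_odp_compliant(["HDFS", "YARN", "MapReduce", "Ambari"]))  # True
print(is_odp_compliant(["HDFS", "YARN", "MapReduce"]))            # False
```

The real certification guidelines would of course test behavior (APIs, configuration, packaging), not just the presence of component names; this sketch only illustrates the platform-side "compliance" idea as opposed to the application-side "compatibility" checks.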
Many of the member companies will be at Hadoop Summit in San Jose. If you haven’t already registered and want to attend, use the code hdp-partner15 for a 20% discount. But don’t wait, the discount expires on Saturday, June 6! While you’re at Hadoop Summit, you can attend the IBM Meet Up and hear more about the ODP. Stay tuned to this blog as well; we’ll use it as a platform to inform you of new developments and provide insight into how the ODP works. Want to know more about the ODP? Here are a few reference documents.