Governance Principles and Structure for the ODP

By Blog

Want to know more about what is going on with the Open Data Platform (ODP)? This post is the first in a series to offer that information. In it, we recap the ODP’s benefits, give an update on the collaboration with the Apache Software Foundation®, preview the information that is coming soon, and explain the overarching guiding principles.

The ODP has continued to make great strides since its introduction last February. Formed to simplify engineering and adoption, and to promote and advance the state of Apache Hadoop® and big data technologies for the enterprise, the ODP has grown to include dozens of member companies from throughout the industry. We have been laying the foundation for open governance, and this is the first in a series of blogs that will recap the progress and share the ODP’s guiding principles. Further blogs in the series will explore the governance of the ODP, the types of member companies and their roles, the processes for developer interactions and releases, and the work done by our board of directors.

Summing up the Key Benefits: Why Become an ODP Member?

The ODP is an initiative started by corporations, but decidedly for the broader Apache Hadoop community. While others have explained the benefits and goals in other articles, they can be summarized in five simple statements.

  1. Build a better end user experience. With the number of Hadoop ecosystem projects growing almost daily, the community needs to explicitly work on reducing friction, inefficiencies, and confusion as enterprises adopt Apache Hadoop platforms.
  2. It’s open to all. With a very low hurdle for ANY developer or company to participate, it is feasible for any person or organization to get involved and have an impact.
  3. Eliminate R&D inefficiencies for suppliers. With a shared qualification and testing toolset, it is easier for suppliers to integrate with a broader set of Hadoop projects.
  4. More value-added functionality and services, delivered more quickly. With streamlined R&D, developers can focus more effort on improving the Hadoop ecosystem with new capabilities, better scalability, tighter security, and greater manageability.
  5. Organized support for the ASF. This entire effort works in a supporting role to the ASF and promotes innovation and development of upstream projects. Industry collaboration will help “a thousand flowers bloom” within the Apache Software Foundation.

Progress: What is Already Here and What is Coming Soon?

Building an organization like the ODP doesn’t happen overnight. We started with a set of guiding principles to inform the bylaws and operations. Specifically, we stated that our goals were to deliver:

  • Industrialization of Big Data – The structure and processes of the ODP are intended to optimally support our efforts to provide a stable, predictable and enterprise-ready ODP Core.
  • One member, one vote in ODP core content decisions – In keeping with the spirit of the open source movement that spawned Hadoop, ODP Core content decisions are made by all Members.
  • Complementary to the Apache Software Foundation (ASF) – The ODP relies on the ASF to innovate and deliver the Apache project technologies included in the ODP core. As such, all ODP Apache-related planning and development will be done in the ASF.
  • Transparent technical processes – As part of our commitment to open source, the technical work of the ODP will be done through open collaboration across the industry.
  • Equal opportunity to participate in development efforts – We welcome ideas from every part of the industry. Anyone, regardless of ODP membership, is invited to participate in our development work.
  • No surprises – The spirit of openness also extends to our planning processes and project status, which are open to all.

Soon, much more detailed information will be available across all ODP-related elements: the business entity, board of directors, bylaws, operating procedures, membership levels, community roles, processes, testing, certifications, governance, and the ODP core release definition itself, determined and managed by the community members.

Why The Open Data Platform Is Such A Big Deal For Big Data

Today, fifteen industry leaders in the big data space announced the intent to create a new industry initiative, identified as the Open Data Platform (“ODP”), to promote open source-based big data technologies and standards for enterprises building data-driven applications. The initial group of member companies includes Platinum members GE, Hortonworks, IBM, Infosys, Pivotal, SAS, and a large international telecommunications firm, and Gold members AltiScale, Capgemini, CenturyLink, EMC, Splunk, Verizon Enterprise Solutions, Teradata, and VMware. Born from the playbook Pivotal used just a year ago to leverage open source and open collaboration to accelerate Cloud Foundry into becoming the biggest open source success in recent years, the Open Data Platform promises to do the same for the Apache Hadoop® ecosystem and big data, and do it quickly.

Everything Starts With Open Source

Last year, Pivotal scribed its open source manifesto, detailing why open source is pivotal to the success of any technology. From recruiting top talent to accelerating adoption, feedback and innovation, open source has long since proven that no proprietary technology can compete with a viable open source alternative. However, while single technologies have thrived with open source, ecosystems naturally lag in development without an organizing force. By openly joining forces with the leading vendors, service providers and users of Apache Hadoop® to focus specifically on the needs of the enterprise, the Open Data Platform aims to reduce fragmentation and accelerate developments and innovation across the Hadoop ecosystem.

Open Collaboration: A Rising Tide That Lifts All Boats

A thriving ecosystem is the key to the real viability of any technology. With lots of eyes on the prize, the technology becomes more stable, offers more capabilities, and, importantly, supports greater interoperability across technologies, making it easier to adopt and use in a shorter amount of time. By creating a formal organization, the Open Data Platform will act as a forcing function to accelerate the maturation of an ecosystem around Big Data.

Of course, the caliber of the members of the organization is also very important. The members have to have relevant expertise and investment in the area. They should also be looking at the challenges from a variety of angles, balancing the views of consumers of the technologies with those of providers. This is why, when we set out to recruit for the Cloud Foundry Foundation, we recruited a variety of tech-savvy companies, from software giants like IBM, EMC and SAP to service providers like Savvis, Rackspace and NTT and industry-leading consumers of PaaS like Monsanto, eBay, and BNY Mellon.

For the Open Data Platform, the first wave of members combines heavyweight brands across Hadoop software providers, including EMC, Hortonworks, IBM, Pivotal, Teradata, Splunk and VMware; service providers like AltiScale, CenturyLink, and Verizon Enterprise Solutions; advanced ISVs like Capgemini, Infosys, and SAS; and, finally, leading Hadoop consumers like General Electric and another large international telco. This is just the first wave, and as an open foundation, we expect to expand the ranks quickly. Once working under the foundation’s framework, each of these companies will pool resources and efforts in cooperation, eliminating redundancies and establishing a clear and agreed way for us all to work. Simply put, this creates operational efficiencies across an entire ecosystem. More investment will flow into the standardized open source, and more innovation and interoperability will flow out of the vendors in the ecosystem, accelerating benefits for all.

First Goals for the Open Data Platform Initiative

To translate this into real tactics and benefits, look for significant progress on three milestones toward a successful ecosystem in the Open Data Platform’s first year:

  • An industry standard and open data management core. Initially focused on Apache Hadoop®, the Open Data Platform will develop and promote a set of open, enterprise focused Hadoop® standards and technologies. This translates to immediate benefits that will increase stability, capabilities, and compatibility among Hadoop® distributions.
  • Certifying a common reference core. The Open Data Platform will deliver a certified, packaged, and tested reference core, giving the industry a coveted “test once, use everywhere” solution. With the entire industry enabled to create big data offerings using this consistent reference implementation, software applications will be more likely to run on any distribution based on the Open Data Platform’s Hadoop® core, reducing risk and vendor lock-in while focusing vendor resources toward more innovation.
  • More support and contributions for the Apache Software Foundation. The Open Data Platform is expected to be complementary and beneficial to the efforts and stewardship of the Apache Software Foundation (ASF), using the existing ASF processes to contribute code, perform testing and integration, and provide infrastructure support, as well as increasing participation in events and collaboration with the developer community.

The Future Is Near

Today’s announcement is about an organization that will be created in the near future. However, progress is not waiting for the Open Data Platform to stand itself up. It assembles many partners who are already working together on big data initiatives. GE helped get Pivotal started specifically to tackle modern challenges of combining big data and the Internet of Things (IoT), with results stacking up to save trillions in the next few years. Hortonworks and Pivotal announced today that they will be combining efforts to support Hadoop distributions and partner on data lake technologies. Real code contributions are also ready, with Pivotal open sourcing our SQL on Hadoop engine called HAWQ, allowing it to run across any distribution of Hadoop based on the Open Data Platform Core.

If the efforts around the Cloud Foundry Foundation are any indicator (the Foundation announced its intent to form last February, stood up in November, and posted record-breaking first-year open source sales by January of this year), everyone should expect the Open Data Platform to usher in big advances for big data sooner rather than later.

Editor’s Note: Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

The 100 Day Progress Report on the ODP

It was just a little over 100 days ago that 15 industry leaders in the Big Data space announced the formation of the Open Data Platform (ODP) initiative. We’d like to let you know what has been going on in that time, give you a preview of what you can expect in the next few months, and let you know how you can become involved.

Some Background

What is the Open Data Platform Initiative?

The Open Data Platform Initiative (ODP) is a shared, enterprise-focused industry effort to simplify adoption, promote the use, and advance the state of Apache Hadoop® and Big Data technologies for the enterprise. It is a non-profit organization being created by people who helped create Apache, Eclipse, Linux, OpenStack, OpenDaylight, Open Networking Foundation, OSGI, WSI (Web Services Interoperability), UDDI, OASIS, Cloud Foundry Foundation and many others. The organization relies on the governance of the Apache Software Foundation community to innovate and deliver the Apache project technologies included in the ODP core while using a ‘one member one vote’ philosophy where every member decides what’s on the roadmap. Over the next few weeks, we will be posting a number of blogs to describe in more detail how the organization is governed and how everyone can participate.

What is the Core?

The ODP Core provides a common set of open source technologies that currently includes Apache Hadoop® (inclusive of HDFS, YARN, and MapReduce) and Apache® Ambari. ODP relies on the governance of the Apache Software Foundation community to innovate and deliver the Apache project technologies included in the ODP core. Once the ODP members and processes are well established, the scope of the ODP Core will expand to include other open source projects.

Benefits of the ODP Core

The ODP core is a set of open source Hadoop technologies designed to provide a standardized core that big data solution providers and software and hardware developers can use to deliver compatible solutions rooted in open source that unlock customer choice. By delivering on a vision of “verify once, run anywhere”, everyone benefits:

  • For Apache Hadoop® technology vendors, reduced R&D costs that come from a shared qualification effort
  • For Big Data application solution providers, reduced R&D costs that come from more predictable and better qualified releases
  • Improved interoperability within the platform and simplified integration with existing systems in support of a broad set of use cases
  • Less friction and confusion for Enterprise customers and vendors
  • Ability to redirect resources towards higher value efforts

100 Day Progress Report

In the 100 days since the announcement, we’ve made some great progress:

Four Platforms Shipping

At Hadoop Summit in Brussels in April, we announced the availability of four Hadoop platforms all based on a vision of a common ODP core: Infosys Information Platform, IBM Open Platform, Hortonworks Data Platform and Pivotal HD. The commercial delivery of ODP-based distributions across multiple industry-leading vendors immediately after the launch of the initiative demonstrates the momentum behind the ODP to accelerate the delivery of compatible Hadoop distributions, and the simplification that using it as an industry standard brings to the ecosystem.

New Members and New Participation Levels

In addition to revealing that Telstra is one of the founding Platinum members of the ODP, we’ve added nine new members, including BMC, DataTorrent, PLDT, Squid Solutions, Syncsort, Unifi, zData, and Zettaset. We welcome these new members and are looking forward to their participation and their announcements. We also announced a new membership level to provide an easy entrée for any company to participate in the ODP. The Silver level of membership allows companies to have a direct voice into the future of big data and contribute people, tests, and code to accelerate executing on the vision.

Community Collaboration at the Bug Bash

ODP member AltiScale led the efforts on a Hadoop Community Bug Bash. This unique event for the Apache Hadoop community, held with co-sponsors Hortonworks, Huawei, Infosys, and Pivotal, brought together over 150 participants from eight countries and nine time zones to strengthen Hadoop and honor the work of the community by reviewing and resolving software patches. Read more about the Bug Bash, where 186 issues were resolved either with closure or patches committed to code. Nice job everyone! You can participate in upcoming bug bashes, so stay tuned.

Technical Working Group and the ASF

Senior engineers and architects from the ODP member companies have come together as a Technical Working Group (TWG). The goal of the TWG is to jump-start the work required to produce ODP core deliverables and to seed the technical community overseeing the future evolution of the ODP core. Delivering on the promise of “verify once and run anywhere”, the TWG is building certification guidelines for “compatibility” (for software running on top of ODP) and “compliance” (for ODP platforms). We have scheduled a second TWG face-to-face meeting at Hadoop Summit, where committers, PMC and ASF members will be meeting to continue these discussions.

What’s Next?

Many of the member companies will be at Hadoop Summit in San Jose. If you haven’t already registered and want to attend, use the code hdp-partner15 for a 20% discount – but don’t wait, the discount expires on Saturday, June 6! While you’re at Hadoop Summit, you can attend the IBM Meet Up and hear more about the ODP. Stay tuned to this blog as well – we’ll use this as a platform to inform you of new developments and provide you insight on how the ODP works.
