The success that ODPi has achieved as a nonprofit organization committed to the simplification and standardization of the big data ecosystem is driven by the dedication of our member organizations and individuals. The ODPi Member Spotlight series interviews key ODPi contributors to explore why they participate in ODPi and to learn more about the individuals whose efforts are accelerating the development of today’s big data ecosystem, standards and solutions.
We recently spoke with Ferd Scheepers, Chief Information Architect for ING, to discuss his involvement with ODPi. In his role as ING’s global Chief Information Architect, Ferd has driven ING’s journey to becoming a data-driven company for the last five years, defining ING’s Data Lake architecture for information management. He is championing the Apache Atlas and ODPi open metadata initiatives, and took time to share his vision, his ideas and what motivates his contributions to ODPi, along with insight on how ING benefits from being an active member.
What is your position at ING, and what are you responsible for?
For the last five years, I have been working as the global Chief Information Architect of ING. In this role, I am responsible for creating the Information Architecture for ING, which is becoming more and more important as we pursue the ambition of becoming a true data-driven organisation. We have created the ING data lake architecture, which is the main vehicle for ING to implement a fully metadata-driven data landscape, where all the data in the organisation is known. By known we mean not only where the data is, but also the data quality, the meaning, the owner of the data, and the full lineage from where the data comes to life, to any place the data is consumed, either by ING employees or by external parties like our regulators.
What is your involvement with ODPi? Tell us about the role you’ve played, your contributions, goals, and interests.
We got involved with ODPi in early 2017. At that time, together with IBM and Hortonworks, we had started to drive an Open Metadata initiative to define a set of open metadata standards, and to build both a reference implementation for an Open Metadata compliant metadata repository and the Open Metadata Highway. The Open Metadata Highway is a set of Open Metadata Repository Services (OMRS) that let different metadata repositories talk to each other in order to exchange metadata. On top of OMRS there is a set of Open Metadata Access Services that enable dedicated applications or UIs, specific to different personas in the organisation, to consume services from the entire metadata landscape.
ODPi, as an existing vendor-neutral organisation, came into the picture as the most logical home for this open standard. Apache Atlas was chosen for the reference implementation for an Open Metadata compliant metadata repository, and the Open Metadata Highway was developed as the Egeria project within ODPi.
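Editor's illustration: the layering described above, with repositories exchanging metadata over a shared highway and persona-specific access services on top, can be pictured with a minimal Python sketch. This is not the Egeria or Apache Atlas API. Every class, method and field name below is hypothetical and was chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """One piece of metadata about a data asset (hypothetical shape)."""
    asset: str        # name of the data asset, e.g. a table
    owner: str        # who owns the data
    description: str  # what the data means
    source_repo: str  # repository where the entry originated


class MetadataHighway:
    """Stand-in for the repository services layer: relays entries between repositories."""

    def __init__(self):
        self.repositories = []

    def register(self, repo):
        self.repositories.append(repo)
        repo.highway = self

    def publish(self, entry):
        # Forward the entry to every repository except the one it came from.
        for repo in self.repositories:
            if repo.name != entry.source_repo:
                repo.receive(entry)


class MetadataRepository:
    """A metadata repository that shares what it learns with its peers."""

    def __init__(self, name):
        self.name = name
        self.highway = None
        self.entries = {}

    def add(self, asset, owner, description):
        entry = MetadataEntry(asset, owner, description, source_repo=self.name)
        self.entries[asset] = entry
        if self.highway is not None:
            self.highway.publish(entry)
        return entry

    def receive(self, entry):
        # Keep a reference copy; an entry already held locally is not overwritten.
        self.entries.setdefault(entry.asset, entry)


class AssetOwnerAccessService:
    """Stand-in for an access service: one persona's view across all repositories."""

    def __init__(self, highway):
        self.highway = highway

    def assets_owned_by(self, owner):
        found = set()
        for repo in self.highway.repositories:
            for entry in repo.entries.values():
                if entry.owner == owner:
                    found.add(entry.asset)
        return sorted(found)


# Example with two made-up repositories and one made-up owner.
highway = MetadataHighway()
lake = MetadataRepository("data-lake-catalog")
warehouse = MetadataRepository("warehouse-catalog")
highway.register(lake)
highway.register(warehouse)

lake.add("customers", "alice", "Customer master data")
warehouse.add("payments", "alice", "Payment transactions")

service = AssetOwnerAccessService(highway)
print(service.assets_owned_by("alice"))
```

In this sketch each repository remains the home of its own entries while still learning about the others, and the access service answers one persona's question (which assets does this owner hold?) without the caller needing to know which repository holds what.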
Why does ING see value in the work for which ODPi provides a vendor-neutral home?
When ING got involved in driving this Open Metadata initiative, we knew that making such an initiative succeed requires several things: a willingness of several vendors to join together to make it a success; at least one company (preferably more) that represents the consumers of these vendor solutions, to explain the need for such a standard from a consumer perspective; and an open, vendor-neutral and respected community to be a home for the standard.
IBM and Hortonworks were involved from day one, representing the vendors. ING took on the role of catalyst to bring them together. Rather than acting only as a voice of the customer, we decided to sit in the driving seat and have a full team contribute to developing this open standard. ODPi, already a very active group in steering the standardisation around Hadoop distributions, seemed the logical choice for a home for the work that we were doing, both because ODPi already had most of the facilities we needed and because many of the vendors we wanted to join this initiative were already members.
ING also became a full member of ODPi in 2017 to support the valuable work ODPi is doing. We very much value the platform ODPi offers for developing the open standard, but more importantly, we value the community of vendors it brings us, and the exposure we get from ODPi to get the open standard known within a bigger community of both vendors and consumers.
What benefit has ING recognized from its membership with ODPi? What value do you expect to see from your participation?
Our participation in ODPi has already given us the platform to develop the ODPi Egeria open metadata standard. A full team from ING has been actively building the standard on the ODPi infrastructure. As a member, we also get to co-steer the direction of the open metadata initiative, and we benefit from the marketing initiatives from ODPi.
Through the community, we have now also involved SAS in the open metadata initiative, and we are talking to others. We expect ODPi to help us make this initiative even better known, both within the vendor community and within the consumer community.
Once the standard is mature, we see a role for ODPi in validating compliance with the standard by delivering a test suite. ODPi will also deliver a set of value packs on top of the standard, like a GDPR pack, something we also see a lot of value in.
What excites you most personally about the technical work being done in ODPi?
Being a real nerd, I love to develop a new standard by really building it from scratch. Unfortunately, I can’t spend all my days coding anymore, so I am limited to reading some of the code that was developed and helping drive the architecture and design for Egeria.
Building this standard, in my opinion, will be a game changer for the data industry. Once we have a way to govern all data in all systems through the metadata, it will take the maturity of data management and governance to a whole new level. Imagine banks like ING delivering data to our regulators through a set of open formats, with the open metadata format on top, and our regulators having full lineage on where the data originated. It would solve all the challenges companies have today in proving that they are in control.
Companies exchanging data will be able to see where their data is being used, and supply usage agreements with that data in an open format. Data will be available everywhere with its full metadata, and every data consumer will understand what data they are looking at, its quality and its definitions, in any technology they use. Imagine customers being able to see exactly where their data is, who has access to it, and what consent they have given.
Data privacy by design will truly become feasible through such a standard. And we will not stop at the traditional data landscape; it also extends to APIs, events, and all the other ways data is made accessible. I believe this standard is the beginning of a transformation in data management, and I think it is a very exciting project to work on.