Managing Privacy in the GDPR-era

Blog, Events, ODPi Egeria

Now that the EU General Data Protection Regulation (GDPR) is in full effect, businesses both large and small have made changes to be fully compliant, regardless of where they are located. The changes include more regulation of how companies collect data, store it, keep it safe from hackers, and use it in their day-to-day activities. Some people think of GDPR as ‘giving the power over data back to the user’. GDPR replaced old data privacy laws that were set up in 1995 and that have been obsolete for some time now.

But what does this mean for the consumer?

According to a Marketing Week article, consumers don’t understand how brands use their data. In fact, 48% of consumers still don’t understand where and how organizations use their personal data, up from 31% when the research was last conducted two years ago.

Only 7% feel they have a good understanding of how companies use their data, and 45% say they “somewhat understand.” Just 18% believe businesses treat people’s personal data in an honest and transparent way.

This is where ODPi comes in. ODPi’s Data Governance initiative aims to create an open data governance ecosystem through collaboration with data governance subject matter experts and data platform and tools vendors. On Thursday, July 12, ODPi is hosting a webinar focused on managing privacy.

Mandy Chessell, distinguished engineer and master inventor at IBM, will share best practices for how IBM manages data in a way that respects individuals’ privacy and complies with new data privacy regulations such as the EU GDPR.

Attendees will learn:

  • The life cycle of a digital service as it is developed, sold, enhanced and used. This life cycle breaks the work into six stages. Each stage describes the roles and the activities involved to ensure data privacy.
  • The types of artifacts that need to be collected about a digital service and the methods used to develop it.
  • How these artifacts link together in an open metadata repository (data catalog).

Click to learn more or to register for the webinar.

The Rise of Big Data Governance: Strata Data Conference and DataWorks Summit Sessions, Webinar, RedGuide and More!

Blog, Events, ODPi Egeria

Today’s most forward-thinking enterprises have all been forced to face similar data challenges: a reliance on real-time data to better serve their customers and, consequently, the requirement to comply with regulations that protect that data, such as the EU’s General Data Protection Regulation (GDPR).

The ODPi Data Governance PMC is working to create a neutral, industry-wide approach to data governance. Its members are supporting the mission of creating an open data ecosystem through collaboration with subject matter experts and data platform and tools vendors.

Below are upcoming speaking sessions, meetups, webinars, and a Redguide meant to further the discussion and work of data governance.

March 6–8, 2018

Strata Data Conference

San Jose, CA

The rise of big data governance: Insight on this emerging trend from active open source initiatives

Speakers:

 Maryna Strelchuk (ING)

 John Mertic (ODPi)

Time: 1:50pm–2:30pm

Date: Wednesday, March 7, 2018

https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/64048

John Mertic and Maryna Strelchuk detail the benefits of a vendor-neutral approach to data governance, explain the need for an open metadata standard, and share how companies like ING, IBM, Hortonworks, and more are delivering solutions to this challenge as an open source initiative. The solution to this emerging challenge is a tricky one. For companies like ING, this data governance challenge has been met with metadata, a consistent view across a large heterogeneous ecosystem, and collaboration with an active open source community.

—————————-

April 16–19, 2018

DataWorks Summit

Berlin, Germany

The rise of big data governance: Insight on this emerging trend from active open source initiatives

Speakers:

 Ferd Scheepers (ING)

 John Mertic (ODPi)

https://dataworkssummit.com/berlin-2018/

Attendees will understand the role of metadata, the need for a cross-technology view on metadata, the role of Apache Atlas as a reference implementation, and the role of ODPi in offering value-added services, such as certification.

ODPi Data Governance PMC

Hosted by:

 Mandy Chessell (IBM)

https://dataworkssummit.com/berlin-2018/bofs/

This Birds of a Feather (BoF) session, hosted by IBM, ING, ODPi, and Hortonworks, will include discussions around the ODPi Data Governance PMC. Come and share your experiences, challenges, and future interests.

—————————-

April 26, 2018 at 9am PST / 12pm EST

ODPi Webinar

Speakers: Mandy Chessell (IBM), John Mertic (ODPi)

Topic – Discussion of the IBM Redguide “The Journey Continues: From Data Lake to Data-Driven Organization”, an overview of the ODPi Data Governance PMC and a look at what’s to come this year.

Sign up here: https://www.odpi.org/projects/data-governance-pmc 

Check @ODPi on Twitter for details soon!

—————————-

Download Now!

The Journey Continues: From Data Lake to Data-Driven Organization

Written by Mandy Chessell (IBM), Ferd Scheepers (ING), Maryna Strelchuk (ING), Ron van der Starre (IBM), Seth Dobrin (IBM), and Daniel Hernandez (IBM)

http://www.redbooks.ibm.com/Abstracts/redp5486.html?Open  

This IBM Redguide™ publication looks back on the key decisions that made the data lake successful and looks forward to the future. It proposes that the metadata management and governance approaches developed for the data lake can be adopted more broadly to increase the value that an organization gets from its data. Delivering this broader vision, however, requires a new generation of data catalogs and governance tools built on open standards that are adopted by a multi-vendor ecosystem of data platforms and tools.

Work is already underway to define and deliver this capability, and there are multiple ways to engage. This guide covers the reasons why this new capability is critical for modern businesses and how you can get value from it.

ODPi Webinar on How BI and Data Science Gets Results

Blog, Events, ODPi BI and AI

By John Mertic, Director of ODPi at The Linux Foundation

ODPi recently hosted a webinar on getting results from BI and Data Science with Cupid Chan, managing partner at 4C Decision, Moon soo Lee, CTO and co-founder of ZEPL and creator of Apache Zeppelin, and Frank McQuillan, director of product management at Pivotal.

During the webinar, we discussed the convergence of traditional BI and Data Science disciplines (machine learning, artificial intelligence, etc.), and why statistical and data science models can now run on Hadoop in a much more cost-effective manner than a few years ago.

The second part of the webinar focused on demos of Jupyter Notebooks and Apache Zeppelin. These were important and relevant demos, as data scientists use Jupyter Notebooks the most, and Apache Zeppelin supports multiple technologies, languages, and environments, making it a great tool for BI.

The inspiration for the webinar was the new Data Science Notebook Guidelines. Created by the ODPi BI and Data Science SIG, the guidelines help bridge the gap so that BI tools can sit harmoniously on top of both Hadoop and RDBMS, while providing the same, or even more, business insight to BI users who also have Hadoop in the backend. Download Now »

Additionally, webinar listeners asked detailed questions, including:

  • How can one transition from a bioinformatics developer to a data scientist in biostatistics?
  • Where do you see the future of both Jupyter and Zeppelin going? Are there other key data science challenges that these tools need to solve?
  • When do you choose to use one notebook over the other?
  • Can the two notebooks be used together, i.e., can you create a Jupyter notebook and save it, then upload it into Zeppelin (or vice versa)?

Overall, the webinar was an insightful discussion on how we can achieve big data ecosystem integration in a collaborative way.

If you missed the webinar, watch the replay and download the slides.

ODPi Takes Big Data Day LA

Blog, Events

By Roman Shaposhnik

Earlier this month, I attended Big Data Day LA – a vibrant community gathering of data and technology enthusiasts in sunny Los Angeles. Located on the USC campus, the 5th annual event was organized by local Big Data user groups and volunteers!

Unlike many of the big data industry’s events, Big Data Day LA wasn’t a company-owned conference and registration was fully covered by data-driven sponsors, like Hortonworks, Disney Interactive, WANdisco and more – making the event free for anyone to attend. As such, it wasn’t surprising that it attracted such a big crowd, with more than 1,500 people in attendance!

During the conference, I presented “Big Data on The Rise: Views of Emerging Trends & Predictions from real life end-users,” where I offered the audience an overview of key trends emerging in 2017 within the Hadoop and Big Data ecosystem. My session also covered data from the ODPi End User Advisory Board (TAB) and real end-user perspectives on how companies are using Big Data tools, the challenges they face, and where they are looking to focus investments. My talk was well received by those in attendance, and quite a few people approached me following the session to discuss their new understanding of ODPi and how it relates to traditional vendors, the Apache Software Foundation, the Linux Foundation and the enterprise.

The remainder of the conference featured a great selection of talks, especially for data scientists and software developers. Unsurprisingly, the entertainment industry track was a huge hit, featuring talks from Netflix, Warner Brothers, Guitar Center and more.

And, of course, being a Silicon Valley guy, I simply had to check out all the startup buzz at the conference as well. Not only did the conference feature an awesome Startup Showcase track, but there were also quite a few presentations pushing the envelope on state-of-the-art machine learning. Just to give you a quick taste, I suggest you go to http://novamente.ai/ and check out the truly sci-fi projects these guys are tackling. Their presentation on how to apply AI to producing ever higher grossing movies (think scripts, casting, visual effects and more) had the audience on the edge of their seats.

A few other sessions I particularly enjoyed include one from Bain & Company, where they put big data and machine learning in the context of the organizational shifts required in any traditional enterprise to realize full value from big data insight. On the flip side, even if you have your organization all lined up for digital transformation, you still have to be mindful of challenges on the technical side of machine learning. The notion of a hidden, growing technical debt in these complex, end-to-end machine learning pipelines is something that we all should keep in mind, and Irina Kukuyeva’s presentation did a great job of highlighting some of these important areas.

Overall, Big Data Day LA was a fun and dynamic event that harnessed the upstream community and showcased the importance of data at every level of business.