How Do I Teach My Second Grade Kid What AI Is?

By Cupid Chan, CTO, Index Analytics

I recently took my kids to Hersheypark in Pennsylvania. In case you haven’t heard of it, it’s just a normal amusement park with rides and long lines. As we were waiting in line, my son asked, “Dad, what do you do at work?”

I said, “I help my clients to define KPIs, and then try to apply Naive Bayes to predict the outcome. If the result is not good, we may need to build a neural network, and test it again.”

Do you really think that’s the answer I gave my son? 

OF COURSE NOT!

Not because what I said is wrong, but because he is simply not the right audience for that type of response. More importantly, I don’t want him to think, “My dad is crazy and I’d better not ask him anything again.” So I needed to come up with an answer in a language that he can understand.

“If a computer can do work but no one knows whether it’s you doing the work or the computer, that’s AI.” – a basic principle of AI proposed by Alan Turing.

“Great! I can then use AI to do my homework and my teacher would not know that it’s not me doing that!”

Supervised Learning 

“Hmm… Do you remember how you taught your younger sister the difference between a pen and an apple? You hold up a pen in front of her so she can see it and say, ‘pen.’ And you hold up an apple so she can see it and say, ‘apple.’ And you repeat this. Sooner or later, you expect her to understand the long pointy thing is a pen. And the red, round thing is an apple.”

Long, pointy, round, red: these are Features in Machine Learning, and “Pen” and “Apple” are Labels. Learning from examples that pair Features with Labels is Supervised Learning. It is one way a computer can learn that different Features are associated with different Labels.

“Dad, I remember I saw a guy teaching people this on YouTube, too!”

PIKOTARO – PPAP (Pen Pineapple Apple Pen) (Long Version) [Official Video]

Well, the song is funny, but it is not really related to Supervised Learning. Still, if it helps plant the concept of Supervised Learning in a child’s mind, why not let it be?

In the real world, Supervised Learning can help in many different ways. One of them is distinguishing a cancer cell from a normal cell. In this case, the computer is the “child” and the doctor is the “parent.” By showing examples repeatedly, the doctor trains the computer to recognize the patterns that separate a normal cell from a cancer cell.
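The pen-and-apple lesson can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product works: the feature values (length, roundness, redness) are made-up numbers, and the "learning" is simply a nearest-neighbor lookup, which answers with the Label of the most similar example it has been shown.

```python
import math

# Hypothetical training examples: (length_cm, roundness 0..1, redness 0..1) -> label.
# The numbers are invented for illustration; they are not measurements.
training_data = [
    ((14.0, 0.1, 0.0), "pen"),
    ((15.0, 0.2, 0.1), "pen"),
    ((13.5, 0.1, 0.9), "pen"),    # a red pen is still a pen
    ((8.0, 0.9, 0.9), "apple"),
    ((7.5, 1.0, 0.8), "apple"),
    ((8.5, 0.9, 0.3), "apple"),   # a green apple is still an apple
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], features))
    return nearest[1]

print(predict((14.5, 0.15, 0.2)))  # a long, thin, mostly non-red object
print(predict((8.2, 0.95, 0.7)))   # a short, round, red object
```

The more labeled examples the "parent" adds to `training_data`, the more unusual pens and apples the "child" can recognize.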

Unsupervised Learning

You may have heard of the Second Law of Thermodynamics, sometimes called the law of entropy. Loosely speaking, unless you put in energy to keep things in order, everything just becomes messier over time.

You can apply the very same law to a kid’s playground. Unless you really put in effort to keep toys tidy, the toys will not automatically go back to their original positions. At my home, my mother-in-law helps the kids keep the play areas organized. Once, when she went to Hong Kong for a vacation, the play areas became more disorganized day after day. Finally, my wife had to step in and demand that the kids clean up before their grandmother returned. She did not give exact instructions. She just demanded they clean up!

Guess what happened in the next few hours? The kids put all the boxy, four-wheeled things in one area, and we called it “Cars.” All the fluffy stuff went together in another area, and we called it “Stuffed Animals.” Then they put all the stackable blocks together in some boxes and named them “Legos.”

They did not get any specific instructions or rules to decide what should go where. But somehow they figured out the similarities and differences. In Machine Learning, this is called Unsupervised Learning.

This is when the computer is given a lot of data points and figures out the patterns by itself. In the real world, Unsupervised Learning can be used for customer segmentation. A company holds a lot of data about a lot of customers. You don’t tell the computer who should be grouped with whom; Unsupervised Learning figures that out. Traditionally, this was done by an expert who looked at attributes such as age, spending pattern, location, and salary, and then tried to group similar customers together. Now the machine can play the role of the expert, scanning through millions of records in a few seconds, which is impossible for any human being.
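Customer segmentation can be sketched the same way. The example below is a bare-bones version of k-means clustering, one common Unsupervised Learning technique (the post does not name a specific one). The customer records and the two starting guesses are assumptions made up for illustration. Notice that no Labels appear anywhere: the code only looks at how close the data points are to each other.

```python
import math

# Hypothetical customers as (age, monthly_spend_usd); values are invented for illustration.
customers = [
    (22, 40), (25, 55), (24, 35),      # younger, lower spend
    (45, 300), (50, 320), (48, 280),   # older, higher spend
]

def mean_point(points):
    """Average each coordinate across a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def k_means(points, centroids, rounds=10):
    """Group points around the nearest centroid, then move each centroid to its group's mean."""
    for _ in range(rounds):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Keep the old centroid if a group ends up empty.
        centroids = [mean_point(g) if g else c for g, c in zip(groups, centroids)]
    return groups

# Starting guesses for the two group centers (an assumption; real code often picks these randomly).
groups = k_means(customers, centroids=[customers[0], customers[3]])
for g in groups:
    print(g)
```

With real data you would usually rescale the attributes first, so that a large-valued one such as spend does not drown out the others.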

Reinforcement Learning

When dealing with kids, simply telling them what to do and showing them proper examples over and over is not always the best approach. At the same time, it’s not very effective to give no instructions and let them figure out everything by themselves.

It’s a common practice in teaching kids to reward them when they do something good. And when they do something bad, you punish them. This is intended to reinforce certain behaviors. In Machine Learning, this is known as Reinforcement Learning.

When a computer performs the way that you want, you add a point. When it fails to do what you want, you deduct a point. Over time, the computer learns what to do to gain points.

In the real world, Reinforcement Learning is applied heavily in Robotics. For example, a robot is trying to walk in a straight line. It may make it or it may fall down. Whenever the robot falls down, you deduct a point. And whenever the robot successfully takes a step, you add a point. There are many motors and sensors on a robot, and all of them are collecting data for the system. The robot learns what motor speeds and what angles are needed to keep walking in a straight line and avoid falling.
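Here is a toy version of the walking robot, offered as a sketch under stated assumptions rather than a real robotics setup. The three stride lengths and their chances of falling are invented. The learner never sees those chances; it only sees the points it earns. The strategy shown is a simple "epsilon-greedy" one: mostly repeat whatever has scored best so far, and occasionally try something else.

```python
import random

# Hypothetical chance of falling for each stride length (made-up numbers for illustration).
FALL_CHANCE = {"short": 0.7, "medium": 0.05, "long": 0.5}

def take_step(stride, rng):
    """Return +1 if the robot completes the step, -1 if it falls."""
    return -1 if rng.random() < FALL_CHANCE[stride] else 1

def learn(trials=5000, explore_rate=0.1, seed=0):
    """Epsilon-greedy learning: mostly repeat the best-scoring stride, sometimes try another."""
    rng = random.Random(seed)
    average_points = {stride: 0.0 for stride in FALL_CHANCE}
    attempts = {stride: 0 for stride in FALL_CHANCE}
    for _ in range(trials):
        if rng.random() < explore_rate:
            stride = rng.choice(list(FALL_CHANCE))                 # explore
        else:
            stride = max(average_points, key=average_points.get)   # exploit
        points = take_step(stride, rng)
        attempts[stride] += 1
        # Update the running average of points earned with this stride.
        average_points[stride] += (points - average_points[stride]) / attempts[stride]
    return average_points

average_points = learn()
best_stride = max(average_points, key=average_points.get)
print(best_stride, average_points)
```

A real robot has far more choices than three stride lengths, but the loop is the same idea: act, collect points, adjust.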

2 Types of Measurement

2 Popular Questions by Kids – Key Approaches in Machine Learning

Kids like to ask strangers, “How old are you?” and “Are you a boy or a girl?”

“How old are you?” is asking for a number. It’s Regression.

“Are you a boy or a girl?” is asking for one of a set of pre-defined categories. It’s Classification. Regression and Classification are two important concepts in Machine Learning.
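The difference shows up directly in what a function returns. In the sketch below, the heights, ages, and school labels are hypothetical values chosen for illustration. The first function fits a straight line and answers with a number (Regression). The second copies the category of the most similar known example (Classification).

```python
# Hypothetical data for five children (invented values).
heights = [100, 110, 120, 130, 140]            # cm
ages = [4, 5, 7, 9, 10]                        # years
schools = ["preschool", "preschool", "grade school", "grade school", "grade school"]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(heights, ages)

def predict_age(height_cm):
    """Regression: the answer is a number."""
    return slope * height_cm + intercept

def predict_school(height_cm):
    """Classification: the answer is one of a fixed set of categories."""
    nearest = min(range(len(heights)), key=lambda i: abs(heights[i] - height_cm))
    return schools[nearest]

print(predict_age(125))      # some number of years
print(predict_school(104))   # one of the two school categories
```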

3 Ways to Learn

Kids observe the world around them and come up with certain rules. They propose a result, and adults correct them, which makes their rules get better and better.

Compare this to the old way of programming: developers observe the world and code the rules themselves using rule-based algorithms. The program produces some results, and based on those results, the developers change or modify the rules.

In AI, it’s a little bit different. The developer creates the AI algorithm and has it create the rules. The algorithm comes up with a model and continues to train it. The model then tries to predict the result, and the prediction is checked for accuracy. The key here is that the algorithm keeps modifying the model using more data without the developer being involved.

That’s the beauty of AI!
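To make the contrast concrete, here is a small sketch that reuses the pen-and-apple idea with a single Feature, length. All values and function names are hypothetical. In `rule_based` the developer types the cutoff by hand. In `train` the algorithm picks the cutoff that makes the fewest mistakes on the examples, so when new examples arrive, retraining changes the model while the code stays the same.

```python
def rule_based(length_cm):
    """Old way: the developer picks the cutoff by hand."""
    return "pen" if length_cm > 10 else "apple"

def train(examples):
    """AI way: the algorithm picks the cutoff that makes the fewest mistakes on the examples."""
    candidates = sorted(length for length, _ in examples)

    def mistakes(cutoff):
        return sum(("pen" if length > cutoff else "apple") != label
                   for length, label in examples)

    cutoff = min(candidates, key=mistakes)
    return lambda length_cm: "pen" if length_cm > cutoff else "apple"

# Hypothetical labeled examples: (length_cm, label).
examples = [(14, "pen"), (15, "pen"), (8, "apple"), (7, "apple")]
model = train(examples)

# New data arrives: a short pen and a large apple.
# Retraining updates the model; nobody edits the code.
examples += [(9, "pen"), (8.5, "apple")]
model = train(examples)

print(rule_based(9))   # the hand-written rule calls the short pen an apple
print(model(9))        # the retrained model has seen a short pen
```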

No Right or Wrong. Just Right or Left! 

Final question: What are the similarities and differences between Tesla and Uber? Both are in the automobile industry. But Tesla creates new technology to help revolutionize the whole car industry, while Uber uses existing technology (like mapping, mobile apps, etc.) to create a new business model.

So the power of AI is not just in making algorithms. It can also lie in using existing algorithms to build new ways of doing business. One builds the technology, the other utilizes it.

Remember my son who was thinking about ways to get his homework done? Ultimately, I would be equally proud if he came up with an algorithm that could do his homework and successfully fool his teacher or if he utilized existing algorithms to do the same thing. Both are important new ways of adopting AI to solve problems. 

There is no Right or Wrong, only Right or Left. But no matter which direction you pick, be persistent and you will cross the finish line of success via either route – Cupid Chan tweet on Nov 28, 2018

The content of this blog has been presented at a few national and international conferences, such as Open Source Summit in Shanghai, China, and MicroStrategy Federal Summit in Washington, DC. I also captured it in the very first video on my YouTube channel, which you can find here: https://www.youtube.com/watch?v=dh9xz4SBukE&t=13s

Twitter: @cupidckchan

Linkedin: www.linkedin.com/in/cupidchan/ 


ODPi Announces New Egeria Conformance Program to Advance Open Metadata Exchange Between Vendor Tools

SAN FRANCISCO, February 11, 2019 – ODPi, a nonprofit Linux Foundation project, accelerating the open ecosystem of big data solutions, today announced the ODPi Egeria Conformance Program, which ensures vendors who ship ODPi Egeria in their product offerings are delivering a consistent set of APIs and capabilities, such that data governance professionals can easily build an enterprise-wide metadata catalog that all their data tools can easily leverage.

Egeria is one of the open source projects under the ODPi umbrella. ODPi aims to be a standard for simplifying, sharing and developing an open big data ecosystem.

“Open metadata and governance is incredibly valuable to IT operating environments. The ODPi Egeria ecosystem is taking a big step today aimed at fulfilling the promise of delivering useful metadata exchange capabilities and vendors are beginning to sign up to the standards,” said John Mertic, director of program management, ODPi. “By adopting ODPi Egeria standards and implementation as the core of your metadata management and governance program, an organization is able to future-proof their investments and be able to adopt the best-of-breed tools for their business.”

Open metadata and governance is a key part of the standardization of IT operating environments. If software and data components can be described in a common way, including the relationships between them, and annotated with governance requirements then it becomes much easier to automate deployments and optimize workload deployments. These are valuable outcomes for any company dealing with big data.

The ODPi Egeria Conformance program makes it possible for vendors to test their products to ensure their conformance to project standards and provides exclusive marks to use in customer-facing support materials. Conformance is accomplished through a self-testing program.

The Conformance program is designed to aid businesses that deal with metadata, which will quickly see the benefits of adding Egeria conformance to the list of requirements for new software tool purchases. Both IBM and SAS, leading vendors of data governance tools who have contributed to ODPi Egeria since its inception, have committed to ship ODPi Egeria Conformant products in 2019. Many more vendors are evaluating ODPi Egeria and will announce their conformance at a later date.

ODPi Egeria, a new project from ODPi launched in August 2018, supports the free flow of metadata between different technologies and vendor offerings. Egeria enables organizations to locate, manage and use their data more effectively. In addition, it provides governance features that smooth over the gaps between different vendor offerings, enabling organizations to have a complete and highly automated governance program.

“ODPi Egeria brings much-needed standards to the world of metadata management and governance,” said Jay Limburn, IBM Distinguished Engineer and Director of Offering Management, Unified Governance and Integration Products. “The work aligns well with our unified governance strategy and we look forward to continuing our work with ODPi to deliver products based on ODPi Egeria and the ODPi Egeria Conformance Program.”

“The ODPi Egeria technology is advancing rapidly due to the support of companies such as IBM, ING and SAS. The project is less than a year old and already it is being embedded in key products,” said Mandy Chessell, lead for the ODPi Egeria project. “The launch of the conformance program is the next phase in its maturity, enabling vendors to advertise that their software can collaborate in the ODPi Egeria ecosystem. By delivering the conformance suite as open source, we are enabling organizations to verify that any technology they are considering purchasing will operate correctly in an ODPi Egeria ecosystem.”

“As a maintainer of the ODPi Egeria project, we are thrilled to see the next step in its maturity, with the ODPi Egeria Conformance program,” said Craig Rubendall, Vice President, Platform Research and Development, SAS.  “This program is critical to ensure the consistency and quality of the solutions integrating with and leveraging the ODPi Egeria open metadata standards. As SAS rolls out products that have this support we can be confident it is being done in a way that ensures the interoperability goals set by ODPi Egeria.”

About ODPi
ODPi is a nonprofit organization committed to simplification and standardization of the big data ecosystem. As a shared industry effort, ODPi members represent big data technology, solution provider and end user organizations focused on promoting and advancing the state of big data technologies for the enterprise. For more information about ODPi, please visit: http://www.ODPi.org

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and commercial adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage.

Media Contact
Nancy McGrory
The Linux Foundation
nmcgrory@linuxfoundation.org