
Dynamics of Selecting your Open Source AI


The landscape of open source AI is vast, and identifying suitable open source tools to build your dream AI product is a herculean task. A poor choice of AI toolkit can prove costly when you need to scale your software, which makes the selection a strategic decision. We at CereLabs have developed the following criteria for choosing an open source AI toolkit.


  1. Vision / Reason for Open Sourcing

To decide whether you can trust an open source AI platform, start with the vision statement with which it was launched. The vision statement reflects the commitment of the company or community behind the toolkit.

Following are the visions of a few reputed open source AI platforms:

OpenCog: “OpenCog is a unique and ambitious open-source software project. Our aim is to create an open source framework for Artificial General Intelligence, intended to one day express general intelligence at the human level and beyond. That is: We're undertaking a serious effort to build a thinking machine.”


TensorFlow by Google: “By sharing what we believe to be one of the best machine learning toolboxes in the world, we hope to create an open standard for exchanging research ideas and putting machine learning in products. Google engineers really do use TensorFlow in user-facing products and services, and our research group intends to share TensorFlow implementations alongside many of our research publications.”


DMTK by Microsoft: “We believe that in order to push the frontier of distributed machine learning, we need the collective effort from the entire community, and need the organic combination of both machine learning innovations and system innovations. This belief strongly motivates us to open source the DMTK project.”

Theano (when it was launched): “This is the vision we have for Theano. This is to give people an idea of what to expect in the future of Theano, but we can’t promise to implement all of it. This should also help you to understand where Theano fits in relation to other computational tools.
    • Support tensor and sparse operations
    • Support linear algebra operations
    • Graph Transformations    
        • Differentiation/higher order differentiation        
        • ‘R’ and ‘L’ differential operators        
        • Speed/memory optimizations        
        • Numerical stability optimizations        
    • Can use many compiled languages, instructions sets: C/C++, CUDA, OpenCL, PTX, CAL, AVX, ...
    • Lazy evaluation
    • Loop
    • Parallel execution (SIMD, multi-core, multi-node on cluster, multi-node distributed)
    • Support all NumPy/basic SciPy functionality
    • Easy wrapping of library functions in Theano”

It should be noted that most of the promises Theano made in its vision were later fulfilled, proving its commitment to a full-fledged open source AI toolkit.


   
  2. Machine Learning Libraries

An ideal AI toolkit provides all the Machine Learning (ML) libraries needed to cover your AI requirements. Building AI products involves many different needs, so the libraries should offer a broad range of machine learning algorithms. Today most AI toolkits ship libraries that keep pace with current research in machine learning. Any AI toolkit must support supervised learning, unsupervised learning and reinforcement learning. At a minimum, the libraries should include Support Vector Machines, Artificial Neural Networks, clustering algorithms and Bayesian networks to fulfil your basic AI needs.
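To make the point concrete, here is a minimal pure-Python sketch of one of the algorithms listed above, k-means clustering (naive seeding, fixed iteration count). A mature toolkit's library ships a tuned, scalable version of this and the other algorithms, which is exactly why library coverage matters.

```python
def kmeans(points, k=2, iters=10):
    """Toy k-means: illustrative only, not a production implementation."""
    # Naive seeding: use the first k points as initial centroids.
    centroids = list(points[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest centroid (squared distance).
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Recompute each centroid as the mean of its assigned points.
        centroids = [
            tuple(sum(vals) / len(vals) for vals in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated groups of 2-D points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
        (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, clusters = kmeans(data)
```

Even this toy version converges on the two obvious groups; a library version adds smarter seeding, convergence checks, and vectorized math.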

   
  3. Support
       
The extent of support provided by the maintainers of an open source AI toolkit sends a clear message about their willingness to improve it. Tracking the changes made to the toolkit over a period of time gives a fair idea of their commitment. Google launched TensorFlow with support for Python 2.7 only; there was a lot of demand for Python 3, and within weeks Google shipped Python 3 support for TensorFlow, proving its commitment to continuous support. Likewise, the promises Theano made in its vision statement were fulfilled in later releases, from supporting tensors to better GPU support. Follow the release updates to get a clear picture of how well your product's AI needs will be fulfilled.

  4. Followers

The number of followers of an open source AI toolkit gives you an impression of its fan base. More followers means more improvements in future releases. A large community also tests the toolkit at a scale that a limited user base would miss. AI runs on data, and a huge user base provides the data that makes a toolkit more intelligent. Data is a major reason corporations open source their toolkits, and the general consensus is that only a toolkit with enough data will win this never-ending race. The size of its following can well decide the fate of an open source AI toolkit.
       
  5. Hardware Compatibility

Practical AI depends on making machine learning algorithms run fast. All major open source toolkits have strong integration with and support for GPUs: with a few lines of code you can place your computation on one or more GPUs.
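As a sketch of what "a few lines of code" looks like, here is device placement in TensorFlow's current eager API (the post predates this API, so treat it as illustrative rather than as the original workflow); soft placement lets the same code fall back to CPU on machines without a GPU.

```python
import tensorflow as tf

# Fall back to CPU automatically if the requested GPU is not present.
tf.config.set_soft_device_placement(True)

# Pin this computation to the first GPU with a single context manager.
with tf.device('/GPU:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # identity matrix
    result = tf.matmul(a, b)
```

Scaling to multiple GPUs follows the same pattern, nesting work under additional `tf.device` scopes or a distribution strategy.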

Facebook has recently open sourced 'Big Sur', its hardware design for AI. The release statement of 'Big Sur' says this:

“We want to make it a lot easier for AI researchers to share techniques and technologies. As with all hardware systems that are released into the open, it's our hope that others will be able to work with us to improve it. We believe that this open collaboration helps foster innovation for future designs, putting us all one step closer to building complex AI systems that bring this kind of innovation to our users and, ultimately, help us build a more open and connected world.”


Check your product's hardware needs and factor them into your toolkit selection strategy. Building a product first and hunting for hardware optimizations later can prove expensive if you discover your toolkit lacks strong GPU support.

  6. Performance

Every new version of an AI toolkit brings both software and hardware performance improvements; a commitment to support leads to continual performance gains. The kind of processing your product needs will help you choose an ideal AI toolkit. Performance deserves an entire article of its own, and we will cover it in future posts.

  7. Documentation

A serious documentation effort reflects dedication to the toolkit's user community. Thorough documentation, together with tutorials, helps AI programmers adopt the toolkit easily, which in turn grows its user base. The commitment of Theano and TensorFlow to detailed documentation and tutorials is helping them attract more followers.
   
  8. Available Skillset in the Market

The programming language support provided by an AI toolkit determines whether there is enough skillset available in the market for you to hire. One reason Google chose Python for TensorFlow is Python's vast ecosystem of libraries, especially in NLP. Google has promised to add support for other languages, including Java. Such steps ensure that your product has a strong pool of AI engineers to hire from.

As more and more companies open source their AI, selecting an ideal AI toolkit for your product will be a challenging task. Proper planning and critical thinking, using the criteria above, will give you enough leverage to build successful AI products.

