These days everything seems to end with tech. Healthtech, Agritech, Biotech, Pharmtech, Fintech and every other type of tech. Within tech we have low tech, high tech and deep tech. There is a diversity of people involved in tech from the humanities to the STEM folks to everyone in between. In all honesty, tech has now become a rallying call for all.
For the past few weeks, I have focused more on writing about human relations and the complexities and interests they foster. In this article, I am revisiting a technology-focused write-up, a primer of sorts on a topic we all know too well!
Deep Learning – what it is and a few things to know.
Here is how to think about it. The broad form of the entire field is called Artificial Intelligence – the use of machines to make intelligent decisions that humans would otherwise make. A machine can be hard, like a robot or a computer, or soft, like a bot or other software-based machine.
So anytime and every time you have a machine functioning by making a decision, that is Artificial Intelligence in motion. Let us look at some examples – factories that use automation to fast-track their work or use robots to move products along a manufacturing line; self-driving cars; and modelling used to make cancer treatment regimens as specific as possible. The truth is that every one of us encounters AI applications every day, and quite often. Because the definition is potentially so broad, some people believe the baseline for calling anything AI should be that it performs at least as intelligently as a human. Well, that is a conversation for another time or even another person.
AI systems started off as rule-based, with logic and expert knowledge applied to yield a result. For example, early approaches to language translation used rules designed by grammar experts. Statistical approaches were introduced later, as more text-based data became available in native languages and needed to be translated into other languages. Some languages lent themselves better to rule-based approaches than others, creating limitations. Expert participation meant that the outcomes would more readily be of high quality and the learning required of the AI was modest. This, of course, limited its application.
To broaden its usefulness, AI infrastructure started to include modest quantities of flexible parameters that the AI could learn and respond to.
Within AI is Machine Learning – this is where a lot of the economics of AI can be harnessed. Machine Learning says that not only are machines capable of making decisions, they can become increasingly smarter because they are capable of learning from the data they are given. This means they can do pattern recognition, perform analyses and make decisions based on the information they are given, and they can do this with minimal human intervention. “Machine learning uses algorithms to parse (break up) data, learn from the data and use that learning to make informed decisions” (Google.com).
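To make the “parse the data, learn it, then decide” loop concrete, here is a minimal sketch in Python. The library (scikit-learn), the toy numbers and the pass/fail framing are illustrative assumptions of mine, not anything from this article:

```python
# Minimal illustration of the machine-learning loop: parse data, learn, decide.
# scikit-learn and the toy study/sleep data are illustrative choices, not from the article.
from sklearn.tree import DecisionTreeClassifier

# Toy training data: [hours_studied, hours_slept] -> passed the exam (1) or not (0)
X_train = [[1, 4], [2, 8], [6, 7], [8, 6], [3, 5], [9, 8]]
y_train = [0, 0, 1, 1, 0, 1]

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)      # "learn the data"

print(model.predict([[7, 7]]))   # use that learning to make an informed decision
```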
Within Machine Learning is Deep Learning – rechristened from neural networks/artificial neural networks (ANNs), which are its backbone. ANNs are networks of artificial neurons, or nodes, that can be used to create AI-related solutions. Deep Learning structures algorithms in layers that form an ANN and uses that framework to imitate the way humans gain the insights, knowledge and perspectives that allow them to solve complex problems. The ANN can learn and make intelligent decisions on its own.
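As a rough sketch of what “algorithms structured in layers” means, the toy network below passes an input through two layers of artificial neurons, each applying weights and a non-linear activation. The layer sizes and random weights are arbitrary illustrations, not a real trained model:

```python
import numpy as np

# A toy two-layer artificial neural network (forward pass only).
# Sizes and weights are arbitrary illustrations, not from the article.
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)      # non-linear activation applied at each layer

x = rng.normal(size=(4,))        # 4 input features
W1 = rng.normal(size=(8, 4))     # layer 1: 8 artificial neurons (nodes)
W2 = rng.normal(size=(2, 8))     # layer 2: 2 output neurons

hidden = relu(W1 @ x)            # each layer transforms the previous layer's output
output = W2 @ hidden
print(output)                    # raw scores that training would make meaningful
```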
I know at this point you might be thinking that all these definitions are saying the same thing but going round in circles, so I hope this helps. AI is the field; ML is machines making intelligent decisions but following the process in a linear format to get to an end point. Deep Learning on the other hand behaves more like a brain with the ability to take information at different points, in different orders and deduce the right decisions from the complex sets of inputs it was served.
So now you know the basics of Deep Learning – let’s get after the rest of it!
Imagine that you had a set of four blocks and needed to determine how many ways they could be arranged. That is relatively easy: there would be 4! (4 factorial) arrangements, which is 24. Like I said, relatively easy to solve. But imagine this: a simple game like chess has four distinct key outcomes (as a chess player, I know there are more): checkmate, draw (including stalemate), resignation and timeout. However, if you were asked how many possible board configurations could arrive at these four outcomes, the answer would most certainly be an insanely large number, as there are so many configurations of plays and strategies that can get you to one of those outcomes. It would take the ability of some of the most experienced and talented world-class chess champions even to begin mapping them, and within that experience they would likely uncover new ways of arriving at any of the outcomes. If you were asked yet another question – how many squares of any size you can make on a chess board – it is not as trivial as the four blocks, but it is still relatively easy to solve. The board has 64 squares in a 1×1 framework; then there are 49 2×2 squares, 36 3×3 squares, and so on down to the single 8×8 square that comprises the entire board, for a total of 204. Of these three examples, only the second possesses enough complexity to warrant leveraging a Neural Network or Deep Learning architecture.
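Both of the “easy” counts above can be checked in a few lines of Python – a quick sketch:

```python
import math

# Arrangements of four distinct blocks: 4! = 24
print(math.factorial(4))                       # 24

# Squares of every size on an 8x8 chessboard:
# there are (9 - k)**2 squares of size k x k, for k = 1..8
total = sum((9 - k) ** 2 for k in range(1, 9))
print(total)                                   # 204 = 64 + 49 + 36 + ... + 1
```

The chess-outcomes question has no such closed-form shortcut, which is exactly why it belongs to the Deep Learning class of problems.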
In 2016, when Google’s AlphaGo (a computer programme) played Lee Sedol, the 18-time world champion, and defeated him 4–1, that was a huge gamechanger. For context, Go is a complex game with around 10 to the power of 170 possible board configurations, and it is often described as a googol times more complex than chess. AlphaGo’s success demonstrated the unbridled capability of AI, and specifically of Deep Learning, when applied to complex problems.
We now know what Deep Learning can do, how much it can assist and how it can be used to “think” like humans and solve complex problems. Let us look at the environment in which we do this.
Obviously, given the sheer complexity of the problems a Deep Learning architecture must tackle, we can quickly surmise that its computational systems will differ in horsepower and configuration from the conventional computers used to run less data-strenuous work. Deep Learning neural networks have an appetite for two things: a. data – in a previous paper I wrote that “Data is the oxygen of AI” – and b. compute power.
- The part about data is the reason AI is relevant in the first place – basically, we have all this data, across so many sectors, that holds high potential intrinsic value when techniques such as pattern recognition, analytics, decision trees and complex computations are applied. If there were no big data, we would still have AI right up to the Machine Learning boundary. However, the bulk of what gives AI its larger-than-tech-life personality is the availability of data, lots of it, especially big data (extremely large inter-related data sets – people sometimes assume big data means volumes of unrelated small chunks of data).
- On the need for specialty compute engines: the demands of Deep Learning include large numbers of connections/nodes within a network, and therefore a humongous demand for compute power. Computers were already moving down this path, with CPUs improving over decades in accordance with Moore’s Law, culminating in a more than ten-million-fold increase in computational power. These supercomputers have been instrumental in tackling the processing requirements of dense and complex neural network computations. They have to be strong enough to learn complex parameters of unbridled flexibility, comprising large numbers of inputs with a myriad of combinatorial options and data forms. This is the environment in which Deep Learning functions, and it is this very capability that makes Deep Learning so versatile and applicable to so many sectors. The training an Artificial Neural Network or Deep Learning system undergoes – breaking down the parameters of each input to model its constituent elements, modelling the unknowns via probabilities of what the information depicts, and teaching itself to recognise and extract the usefulness of such data to solve countless broad-based problems – in many ways defies fundamental math and physics intuition. In Deep Learning, the training process improves the quality of the interpretation of the data, including iteratively reorganising input parameters from random initialisations to allow for better modelling, processing and interpretation, and hence better outcomes (a minimal sketch of such a training loop follows this list). This is what most mimics the human brain: the ability to get better at decision making, approaching a problem, taking in information, processing it, using it, and turning it into knowledge and outcomes.
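To ground the idea of random initialisation followed by iterative training, here is a minimal gradient-descent sketch in plain numpy. The single-layer model, toy data and learning rate are all illustrative assumptions, not a description of any specific system:

```python
import numpy as np

# Minimal illustration of iterative training from a random initialisation.
# The data, model size and learning rate are toy choices for illustration only.
rng = np.random.default_rng(42)

X = rng.normal(size=(100, 3))                 # 100 examples, 3 input features
true_w = np.array([2.0, -1.0, 0.5])           # the pattern hidden in the data
y = X @ true_w + 0.1 * rng.normal(size=100)   # noisy observed targets

w = rng.normal(size=3)                        # random initialisation of the parameters
lr = 0.1                                      # learning rate

for step in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)      # gradient of the mean squared error
    w -= lr * grad                            # each iteration nudges the parameters closer

print(w)                                      # ends up close to [2.0, -1.0, 0.5]
```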
Now, with all this great ability and flexibility, AI’s Deep Learning is applicable in just about every sector – education, healthcare, retail, manufacturing, biotech, safety, agriculture, logistics, entertainment and just about every known field – but there is a cost.
The inherent value, and the fascinating fact, about the neural networks in a human brain is that with use there is continuous learning, and with learning come better and faster decision-making processes, a better ability to handle more information, a better ability to process complex details and ultimately better outcomes. These improvements can be made by statistically significant factors every time the brain goes through the decision process. For a human brain to improve its predictive outcome by a factor of X, the path to that improvement may entail processes involving a tremendous number of input variables or a modest number, and may depend on a host of internal and external factors including the brain, mind, body, environment, tools and so on. The process might also vary from person to person. In the same vein, the Deep Learning infrastructure makes its leap in performance through the similarly modelled artificial neural network infrastructure explained before. The issue, though, is that in order to improve performance by a factor of X, the model must be trained on an exponentially larger number of input data points.
For example, in order to improve the prediction of COVID sufferers from X-ray data from 89% true positives to 95% true positives, Okezue, a product developer and AI researcher, trained his AI model on an additional 2X volume of diverse inputs. Okezue could instead have engaged an expert oncologist to help sort the X-ray images, so that the class of inputs he trained his model on would have a high level of accuracy, a quick learning time, a low number of computations and a significantly reduced input size. The issue is that his model would also be stunted in application. While it would likely distinguish a COVID sufferer from a pneumonia or lung cancer sufferer on the criteria the expert provided, if it encountered X-rays with additional parameters not identified, or incorrectly specified, by the expert oncologist, it might generate a number of false negatives – which would be unfortunate and would require a multi-factor verification process before an accurate diagnosis, or any level of usefulness, could be extracted from it.
Instead, he created a Deep Learning algorithm that is flexible in its approach and able to learn from a wide and diverse set of inputs. His solution could be classified as less efficient, since it requires a larger number of computations to achieve the same performance ratio as an expert system. However, over large amounts of input data, his flexible model will significantly outperform a fixed model whose relevant variables are dictated by expert oncologist input. Okezue’s bigger Deep Learning model framework requires more computation to train on more data, which equates to more computing power! If he targeted a lower error rate, he could add more parameters or more permutations of the inputs for training his model, and hence more computations to achieve the targeted outcome – therefore even more computational power.
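As a purely illustrative sketch of the kind of flexible image classifier described above – PyTorch is an assumed library choice here, and the tiny network, random stand-in images and three classes are not Okezue’s actual model or data:

```python
import torch
import torch.nn as nn

# Toy convolutional classifier for 64x64 grayscale X-ray-like images with three
# illustrative classes (e.g. COVID, pneumonia, other). The architecture, sizes
# and random data are stand-ins, not the model described in the article.
class TinyXrayNet(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)                  # learn image features directly from data
        return self.classifier(x.flatten(1))  # map features to class scores

model = TinyXrayNet()
images = torch.randn(4, 1, 64, 64)            # stand-in batch of four "X-rays"
labels = torch.tensor([0, 1, 2, 0])           # stand-in labels

loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()                               # gradients for one training step
print(loss.item())
```

The point of the sketch is only the shape of the approach: a model flexible enough to learn its own features from many diverse images, at the price of far more computation than a hand-crafted, expert-constrained rule set.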
The cost of the improved performance that Deep Learning is capable of is computational power. The hardware may include GPUs (Graphics Processing Units), optical or quantum systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), memory etc., plus the cost of all software resources as well. For example, Google’s DeepMind team created the AlphaGo computer that beat Lee Sedol at Go by spending $35m on the system. OpenAI (an AI research and deployment company) built GPT-3 (their Deep Learning language system) for $4M.
As Deep Learning proliferates and systems move from lab to market, the cost of these systems will play a big role. They will require Deep Pockets to deploy; the hope is that the cost of development will eventually drop enough to counteract any hindrance to field deployment. It will require all hands on deck – nonetheless, the compute engines that enable Deep Learning are a beast and will continue to demand advancing innovation!
About Ngozi Bell
Inspiration, Hard Work, Innovation. These three foundational elements anchor Ngozi’s core belief that manifesting the extraordinary is always within reach. Inspired by her mother A.C. Obikwere, a scientist and author, she learned the privilege of living at the edge of important encounters and dedicating herself to robust and perpetual learning. Ngozi’s background is a combination of Physics, Engineering, Venture Capital/Private Equity, regulations, and business, where she has managed over $1B in cumulative revenue. Ngozi is a speaker, storyteller, and writer on a diverse set of topics including AI, iDLT, ML, Signal Processing, IoT, women, entrepreneurship and more. She contributes regularly to VOA, has been a TEDx speaker and is published on tech and non-tech platforms. She is a champion of STEM, women, youth, art and the Africa we must engage. Ngozi is an adjunct professor of Physics and management with work experience in Asia, Europe, Africa, the Middle East, and North America. She is a founder of a number of enterprises and host of the podcast Stem, Stocks and Stews (https://anchor.fm/stemstocksstews-podcast).