Machine learning has come a long way since its origins in the 1950s. The Logic Theorist, developed by Allen Newell, Herbert Simon, and Cliff Shaw in 1956, is widely regarded as the first artificial intelligence program.
The program was designed to simulate human problem-solving and was a major breakthrough in the field of artificial intelligence, using a combination of symbolic logic and heuristic reasoning to prove theorems.
In the late 1950s and 1960s, the concept of machine learning began to take shape, with researchers like Frank Rosenblatt developing the perceptron, an early feedforward neural network. The perceptron was a significant innovation in machine learning, but it had limitations that would only be addressed in later decades.
Those limitations led to a decline in interest in neural networks, but the perceptron paved the way for the development of more advanced algorithms and techniques in the 1980s.
A Brief Overview
Deep learning is a more evolved branch of machine learning that uses layered neural networks to process data and approximate the way humans learn. It's often used to recognize objects in images and to understand human speech.
The first layer in a deep learning network is called the input layer, while the last is called the output layer. All the layers between the input and output are referred to as hidden layers.
Feature extraction is another aspect of deep learning, used for pattern recognition and image processing. It uses algorithms to automatically construct meaningful "features" from the data for the purposes of training, learning, and understanding.
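To make the layer terminology concrete, here is a minimal sketch of a feedforward network in Python with NumPy. The layer sizes, random weights, and sigmoid activation are illustrative assumptions, not details of any system mentioned in this history.

```python
import numpy as np

def sigmoid(x):
    """Squashing activation applied between layers."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative layer sizes: 4 input features, one hidden layer of 8 units, 3 outputs.
n_input, n_hidden, n_output = 4, 8, 3

# Randomly initialized weights and biases for the two weighted layers.
W1, b1 = rng.normal(size=(n_input, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_output)), np.zeros(n_output)

def forward(x):
    """Pass one example from the input layer, through the hidden layer, to the output layer."""
    hidden = sigmoid(x @ W1 + b1)       # hidden layer activations
    output = sigmoid(hidden @ W2 + b2)  # output layer activations
    return output

x = rng.normal(size=n_input)  # one example presented to the input layer
print(forward(x))             # three output-layer activations
```

Each call to forward carries an example from the input layer, through the hidden layer, to the output layer; deeper networks simply stack more hidden layers between the two.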
Machine learning, a subfield of AI, uses algorithms and neural network models to help computer systems progressively improve their performance. It's a necessary part of modern business and research for many organizations today.
An early model of brain cell interaction that machine learning draws on was described in 1949 by Donald Hebb in his book "The Organization of Behavior." Hebb's theories on how neurons excite and communicate with one another were a key part of this model.
The Dartmouth Conference and AI Origins
In 1956, a small group of researchers from various disciplines gathered at Dartmouth College for a summer-long workshop focused on investigating the possibility of "thinking machines." They believed that every aspect of learning or intelligence could be described precisely enough for a machine to simulate it.
John McCarthy, one of the key figures behind the conference, coined the term "artificial intelligence" to describe the science of building machines that behave intelligently. He envisioned machines that could learn and adapt, much like humans.
Through this workshop, McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon helped establish artificial intelligence as a field of study, and the term became widely recognized.
1958
In 1958, Frank Rosenblatt developed the perceptron, an early Artificial Neural Network (ANN) that could learn from data and became the foundation for modern neural networks.
This innovation was a significant step towards creating machines that could think and learn like humans.
The perceptron was a major breakthrough in AI research, paving the way for the development of more sophisticated neural networks that could process and learn from complex data.
This technology has since been used in a wide range of applications, from image recognition to natural language processing.
The perceptron's ability to learn from data was a key aspect of its design, allowing it to improve its performance over time.
This concept of learning from data is still a fundamental aspect of AI research today, with many modern AI systems relying on similar techniques to improve their performance.
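To illustrate what "learning from data" meant for the perceptron, here is a minimal sketch of the classic perceptron update rule in Python. The toy AND-gate data, learning rate, and epoch count are assumptions chosen for the example, not details of Rosenblatt's original machine.

```python
import numpy as np

# Toy, linearly separable data: the logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
lr = 0.1          # learning rate (assumed)

def predict(x):
    """Threshold activation: fire (1) if the weighted sum exceeds zero."""
    return 1 if x @ w + b > 0 else 0

# Perceptron learning rule: nudge the weights whenever a prediction is wrong.
for epoch in range(10):
    for xi, target in zip(X, y):
        error = target - predict(xi)
        w += lr * error * xi
        b += lr * error

print([predict(xi) for xi in X])  # expected: [0, 0, 0, 1]
```

Every misclassified example nudges the weights toward the correct answer, which is why the perceptron's predictions improve over time on linearly separable data.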
Dartmouth Conference
In 1956, a group of researchers from various disciplines gathered at Dartmouth College for a summer-long workshop focused on investigating the possibility of "thinking machines." This workshop is widely recognized as a founding event of the AI field.
The group, led by John McCarthy, a mathematics professor at Dartmouth, believed that every aspect of learning or any other feature of intelligence could be precisely described and simulated by a machine. They were pioneers in their field, exploring uncharted territory.
John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon coined the term artificial intelligence in a proposal for the workshop. This term has since become synonymous with the field of study.
Allen Newell, Herbert Simon, and Cliff Shaw wrote Logic Theorist, the first AI program deliberately engineered to perform automated reasoning. This program marked a significant milestone in the development of AI.
The Dartmouth conference was a pivotal moment in AI history, laying the groundwork for the field's future growth and innovation.
AI's First Decades
In the 1950s, computing machines were essentially large-scale calculators, with organizations like NASA relying on human "computers" to solve complex equations.
The Dartmouth Conference of 1956 laid the groundwork for the field, and the excitement it generated grew over the next two decades.
In the 1960s and 1970s, the early signs of progress in AI emerged, including a realistic chatbot and other inventions that hinted at the possibilities of artificial intelligence.
In 1979, Kunihiko Fukushima published work on the neocognitron, a hierarchical, multilayered ANN used for pattern recognition tasks.
The early excitement of the Dartmouth Conference continued to grow, setting the stage for the significant growth of AI in the years to come.
Laying the Groundwork: 1960s-1970s
The 1960s and 1970s were a pivotal time for AI research, building on the excitement generated by the Dartmouth Conference. This period saw significant progress in the field, with early signs of success emerging in the form of a realistic chatbot and other inventions.
In 1966, the Artificial Intelligence Center at the Stanford Research Institute developed Shakey the Robot, a mobile robot system equipped with sensors and a TV camera that could navigate different environments.
Shakey's abilities, although crude compared to today's developments, helped advance elements in AI, including visual analysis, route finding, and object manipulation. This marked an important milestone in the development of AI, showcasing the potential for robots to function independently in realistic environments.
This period also saw continued work on the perceptron, a machine designed for image recognition that was initially planned as a machine, not a program. The software, originally designed for the IBM 704, was installed in a custom-built machine called the Mark 1 perceptron.
AI Enthusiasm Wanes: 1980s-1990s
The AI winter that began in the 1970s continued throughout much of the following two decades.
In 1973, a critical report by Sir James Lighthill led to stark funding cuts, further exacerbating the issue.
The term "AI winter" was first used in 1984 to describe the gap between AI expectations and the technology's shortcomings.
A brief resurgence in the early 1980s couldn't sustain the momentum, and the field continued to struggle.
It wasn't until the late 1990s that the field gained more R&D funding to make substantial leaps forward.
Breakthroughs and Advancements
LSTM, a recurrent neural network model introduced in 1997, could learn tasks requiring memory of events thousands of steps earlier, and it went on to revolutionize speech recognition training.
In 2007, LSTM started outperforming traditional speech recognition programs, and by 2015, Google's speech recognition program had a significant performance jump of 49 percent using a CTC-trained LSTM.
In 1989, Yann LeCun's application of backpropagation to convolutional neural networks enabled the creation of practical networks that could read handwritten digits.
The support vector machine, developed in 1995 by Corinna Cortes and Vladimir Vapnik, was another significant advancement in machine learning.
Faster processing power, made possible by GPUs, increased computational speeds by 1000 times over a 10-year span, allowing neural networks to compete with support vector machines.
The generative adversarial network (GAN) was introduced in 2014 by Ian Goodfellow. It refines its outputs by having two neural networks, a generator and a discriminator, play against each other in a game.
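To sketch that two-network game, the minimal GAN below, written in PyTorch, learns to mimic a one-dimensional Gaussian distribution. The network sizes, learning rates, and toy target distribution are assumptions for illustration, not details of Goodfellow's original experiments.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Generator: maps random noise to a fake "data" sample (here, a single number).
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
# Discriminator: outputs the probability that a sample is real rather than generated.
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

real_label, fake_label = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(2000):
    # Real samples from the target distribution the generator must learn to imitate.
    real = torch.randn(64, 1) * 0.5 + 3.0
    noise = torch.randn(64, 8)
    fake = G(noise)

    # Discriminator step: learn to tell real samples apart from generated ones.
    opt_d.zero_grad()
    d_loss = loss_fn(D(real), real_label) + loss_fn(D(fake.detach()), fake_label)
    d_loss.backward()
    opt_d.step()

    # Generator step: learn to fool the discriminator into labeling fakes as real.
    opt_g.zero_grad()
    g_loss = loss_fn(D(fake), real_label)
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(256, 8)).mean().item())  # should drift toward ~3.0
```

The discriminator is rewarded for telling real samples from fakes, while the generator is rewarded for fooling it; that competition is what pushes the generator's outputs toward the real distribution.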
1969
In 1969, Arthur Bryson and Yu-Chi Ho described a backpropagation learning algorithm to enable multilayer ANNs, a significant advancement over the perceptron.
This breakthrough laid the foundation for deep learning, which would later revolutionize the field of artificial intelligence.
Marvin Minsky and Seymour Papert published Perceptrons, a book that highlighted the limitations of simple neural networks and led to a decline in neural network research.
As a result, symbolic AI research gained momentum, but it would take a few more years for neural networks to regain popularity.
Much later, DeepMind introduced deep reinforcement learning, in which a convolutional neural network learned from rewards through repeated game play, eventually surpassing human expert levels.
The 80s and 90s
The 80s and 90s were a pivotal time for AI research, marked by significant breakthroughs and setbacks. In 1985, Terry Sejnowski created NETtalk, a program that learned to pronounce written words, its speech improving with training much the way a child's does.
Yann LeCun's 1989 demonstration at Bell Labs was a major milestone: it combined convolutional neural networks with backpropagation to read handwritten digits. The system was eventually used to read the numbers on handwritten checks.
The second AI winter (roughly 1985 through the mid-1990s) then set in, slowing research on neural networks and deep learning as overly optimistic predictions gave way to disappointment. Despite this, Corinna Cortes and Vladimir Vapnik developed the support vector machine in 1995.
Sepp Hochreiter and Jürgen Schmidhuber developed LSTM (long short-term memory) for recurrent neural networks in 1997. This was a crucial step forward, but it wasn't until 1999 that computers started becoming significantly faster at processing data, thanks to the development of GPUs (graphics processing units).
Faster processing with GPUs increased computational speeds by 1000 times over a 10-year span, making neural networks more competitive with support vector machines. Neural networks offered better results using the same data, and they continued to improve as more training data was added.
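As a hedged sketch of that comparison, the scikit-learn snippet below fits a support vector machine and a small neural network to the same toy dataset. The dataset, model settings, and split are arbitrary choices for illustration and say nothing definitive about either method in general.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# A small synthetic, nonlinearly separable dataset (an assumption for the demo).
X, y = make_moons(n_samples=1000, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
net = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                    random_state=0).fit(X_train, y_train)

print("SVM accuracy:       ", svm.score(X_test, y_test))
print("Neural net accuracy:", net.score(X_test, y_test))
```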
2006
In 2006, Netflix launched the Netflix Prize competition, a challenge to create a machine learning algorithm more accurate than their proprietary user recommendation software.
This competition aimed to improve the accuracy of movie recommendations, which was a significant goal for Netflix at the time.
IBM Watson
IBM Watson was built by IBM to compete on the US quiz show Jeopardy! in 2011. It was designed to receive natural language questions and respond accordingly.
Watson was fed data from encyclopedias and across the internet to prepare for the show. This extensive data intake allowed it to beat two of the show's most formidable all-time champions, Ken Jennings and Brad Rutter.
Speech Recognition
Speech recognition has made tremendous progress in recent years, thanks in part to a technique called long short-term memory (LSTM).
LSTM is a type of neural network model that was first described by Jürgen Schmidhuber and Sepp Hochreiter in 1997. It's capable of learning tasks that require memory of events thousands of steps earlier.
Around 2007, LSTM started outperforming traditional speech recognition programs, marking a significant shift in the field.
In 2015, Google's speech recognition program saw a 49 percent performance jump after adopting a CTC-trained LSTM.
This breakthrough has led to more accurate and efficient speech recognition systems, which are now being used in a variety of applications.
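For a sense of how a CTC-trained LSTM is wired together, here is a minimal PyTorch sketch that maps a sequence of acoustic feature vectors to per-frame character probabilities. The feature size, label alphabet, and random dummy data are assumptions; a production speech system would use far larger models trained on real audio.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_features = 40  # e.g. 40 filterbank features per audio frame (assumed)
num_classes = 29   # 28 characters plus 1 CTC "blank" symbol (assumed alphabet)

lstm = nn.LSTM(input_size=num_features, hidden_size=128, num_layers=2, batch_first=True)
classifier = nn.Linear(128, num_classes)
ctc_loss = nn.CTCLoss(blank=0)

# Dummy batch: 4 utterances of 100 frames each, with 20-character transcripts.
features = torch.randn(4, 100, num_features)
targets = torch.randint(1, num_classes, (4, 20))
input_lengths = torch.full((4,), 100, dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)

hidden_states, _ = lstm(features)                      # (batch, time, hidden)
log_probs = classifier(hidden_states).log_softmax(-1)  # per-frame class log-probabilities

# CTCLoss expects (time, batch, classes); it aligns frames to characters for us.
loss = ctc_loss(log_probs.transpose(0, 1), targets, input_lengths, target_lengths)
loss.backward()
print(loss.item())
```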
Facial Recognition Becomes Reality
Facial recognition has come a long way since 2006. The Face Recognition Grand Challenge evaluated popular face recognition algorithms that year.
In 2006, the Face Recognition Grand Challenge – a National Institute of Standards and Technology program – tested 3D face scans, iris images, and high-resolution face images. Their findings were impressive, with new algorithms being ten times more accurate than those from 2002.
Some algorithms were able to outperform human participants in recognizing faces. This suggests that technology has surpassed human capabilities in certain areas.
These new algorithms were also 100 times more accurate than those from 1995. This rapid improvement in accuracy is a testament to the advancements in facial recognition technology.
2000-2010
Around 2000, the vanishing gradient problem emerged as a significant obstacle for deep neural networks. The issue arises because certain activation functions squash a wide range of inputs into a small output range, so the gradient shrinks as it is propagated back through many layers and the earliest layers learn extremely slowly or not at all.
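A small numerical sketch of why this happens: the sigmoid's derivative is at most 0.25, so the gradient shrinks multiplicatively as it is pushed back through each layer. The depth and random weights below are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # never larger than 0.25

rng = np.random.default_rng(0)
depth = 20   # assumed number of stacked sigmoid layers

grad = 1.0   # gradient arriving at the output layer
for layer in range(depth):
    pre_activation = rng.normal()  # a typical pre-activation value
    weight = rng.normal()          # a typical weight
    grad *= sigmoid_derivative(pre_activation) * weight

# After 20 layers, the surviving gradient is usually vanishingly small.
print(f"gradient reaching the first layer: {grad:.3e}")
```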
Around the same time, a 2001 research report by META Group (now Gartner) described the challenges and opportunities of data growth as three-dimensional: increasing volume, increasing velocity, and an increasing variety of data sources and types.
The report predicted the onslaught of Big Data, which would change the way machine learning works, since data drives learning.
Fei-Fei Li, an AI professor at Stanford, launched ImageNet in 2009, creating a free database of over 14 million labeled images. This was a crucial step in making labeled images available for training neural nets.
Professor Li's vision was that big data would revolutionize machine learning, and it did. The availability of labeled images enabled the development of more accurate neural networks.
Modern AI
The AI growth in the 2000s laid the foundation for the modern AI we see today. With renewed interest in AI beginning in 2000, the field experienced significant growth.
This growth was largely driven by the development of new AI systems that could learn from data and improve over time. By 2020, AI had reached a new level with the emergence of generative AI.
Generative AI has revolutionized the way AI systems interact with us, allowing them to generate text, images, and videos in response to text prompts. This ability to create new content has opened up new possibilities for AI applications.
AI Surge: 2020-Present
The AI surge in recent years has largely come about thanks to developments in generative AI, or the ability for AI to generate text, images, and videos in response to text prompts.
Generative AI continues to learn from materials like documents, photos, and more from across the internet. This is a significant departure from past systems that were coded to respond to a set inquiry.
One notable example of generative AI is DALL-E, an OpenAI creation released in 2021. DALL-E is a text-to-image model that responds to natural language text by generating realistic, editable images.
DALL-E's first iteration used a 12-billion-parameter version of OpenAI's GPT-3 model, trained on pairs of text and images. This scale enables DALL-E to produce highly realistic images.
ChatGPT Released
ChatGPT was released in 2022 by OpenAI, and it's a game-changer. It's built on GPT-3.5, a refinement of the GPT-3 foundation that was trained on enormous amounts of text to improve its natural language processing abilities.
This means ChatGPT can interact with users in a far more realistic way than previous chatbots. It can even ask follow-up questions and recognize inappropriate prompts.
ChatGPT can be used for various tasks, such as helping with code or resume writing, beating writer's block, or conducting research. Users simply prompt ChatGPT for the desired response.
GPT-3, the foundation of the large language model (LLM) behind ChatGPT, has 175 billion parameters, a huge leap from the 1.5 billion parameters of its predecessor, GPT-2.
2011-2020
By 2011, GPU speeds had increased significantly, making it possible to train convolutional neural networks without layer-by-layer pre-training.
This breakthrough led to a surge in deep learning's efficiency and speed, with convolutional networks winning international competitions in 2011 and 2012, most notably AlexNet's victory in the 2012 ImageNet challenge.
AlexNet's architecture was particularly noteworthy, using rectified linear units (ReLUs) to speed up training and dropout to reduce overfitting.
Rectified linear units were a game-changer, allowing for faster training times and better performance than saturating activations such as the sigmoid.
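The snippet below sketches, in PyTorch, the kind of ReLU-plus-dropout stacking AlexNet popularized, shown here as a small fully connected classifier head. The layer sizes and dropout rate are placeholders, not AlexNet's actual configuration.

```python
import torch
import torch.nn as nn

# A small classifier head in the AlexNet spirit: Linear -> ReLU -> Dropout, repeated.
classifier = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(),          # rectified linear unit: cheap to compute, gradient is 1 when active
    nn.Dropout(p=0.5),  # randomly zero half the activations during training to curb overfitting
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(128, 10), # 10 output classes (assumed)
)

x = torch.randn(8, 256)     # a dummy batch of 8 feature vectors
classifier.train()          # dropout active during training
print(classifier(x).shape)  # torch.Size([8, 10])
classifier.eval()           # dropout disabled at inference time
print(classifier(x).shape)
```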
The Cat Experiment, released by Google Brain in 2012, was another significant milestone in the field of deep learning.
This unusual project explored the challenges of unsupervised learning, where a neural net is trained on unlabeled data and asked to find recurring patterns.
The Cat Experiment used a neural net spread across 1,000 computers, processing 10 million unlabeled images from YouTube.
At the end of the training, one neuron in the highest layer responded strongly to images of cats, and another neuron responded strongly to human faces.
Andrew Ng, the project's founder, noted the significance of this finding, highlighting the potential for unsupervised learning to revolutionize the field.
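The Cat Experiment's scale is far out of reach here, but the core idea of finding structure in unlabeled data can be sketched with a tiny autoencoder in PyTorch: no labels are used, and the network is simply asked to reconstruct its input through a narrow bottleneck. The synthetic data, dimensions, and training schedule below are toy assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Unlabeled "images": random vectors lying near a low-dimensional pattern.
basis = torch.randn(4, 64)
data = torch.randn(512, 4) @ basis + 0.1 * torch.randn(512, 64)

# Autoencoder: squeeze 64 inputs through a 4-unit bottleneck and reconstruct them.
encoder = nn.Sequential(nn.Linear(64, 4), nn.ReLU())
decoder = nn.Linear(4, 64)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(500):
    opt.zero_grad()
    reconstruction = decoder(encoder(data))
    loss = loss_fn(reconstruction, data)  # no labels: the input is its own target
    loss.backward()
    opt.step()

print(f"final reconstruction error: {loss.item():.4f}")
```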
Frequently Asked Questions
Who is the founder of machine learning?
Alan Turing is considered the founder of machine learning, while Arthur Samuel is credited with coining the term in 1959. Turing's work laid the foundation for the field in the 1950s.
Who first invented machine learning?
Machine learning was founded by Alan Turing in the 1950s, with Arthur Samuel coining the term in 1959. Alan Turing is credited as the pioneer of machine learning, laying the groundwork for its development.