B.F. Skinner’s Theory of Operant Conditioning


Learning is the process of acquiring new and relatively lasting information or behaviours. There are many theories of learning; three key ones are classical conditioning, operant conditioning and social learning.


Classical Conditioning

Classical conditioning is a useful and popular theory of learning. It was developed by Ivan Pavlov. The theory states that learning occurs when an individual comes to associate two or more stimuli and can predict events based on that association.

Pavlov built an apparatus that accurately measured a dog's salivation during feeding. He rang a bell (neutral stimulus) before providing food to the dog. After several trials, the dog learnt to associate the ringing of the bell with food. This experiment pioneered the study of classical conditioning.

Social Learning Theory

Social learning theory is another useful and popular theory of learning. It was proposed by Albert Bandura. It states that individuals learn new behaviours by observing and imitating others.

For example, Isla watches Kim go near a candle and burn his finger. She notices he is in pain. Isla now avoids going too close to candles. She has learnt from Kim’s actions and their consequences.

Social learning theory involves five steps.

1. An individual must observe a particular action.
2. The observer needs to pay attention to that action.
3. The observer also needs to retain a memory of that action.
4. The observer then reproduces the action if there is enough motivation to do so.
5. If the observer notices positive or neutral consequences of the action, he or she is likely to repeat it.

In the previous example, Isla watched Kim go through pain because of the burn, so she was not motivated to repeat the action.

Operant Conditioning

B.F. Skinner was an influential figure in the field of behaviourism. He conducted a study that demonstrates operant conditioning.

For the experiment, he designed an operant chamber, popularly known as Skinner’s box. The box had a lever that an animal (a rat) could press to gain a reward (food or water). A separate instrument outside the box recorded the animal’s responses. The setup demonstrated reinforcement, that is, how the consequence of an action encourages a particular response.

Operant Conditioning and Learning

Operant conditioning, sometimes referred to as instrumental conditioning, is a method of learning that occurs through rewards and punishments for behaviour. Individuals associate their behaviours with consequences. Behaviours that are followed by praise or encouragement (reinforcement) tend to increase, and behaviours that are met with punishment or annoyance tend to decrease.

For example, when a child finishes homework within a given time frame and the parent praises the child for it, the praise reinforces the behaviour and the child is more likely to repeat it. When a child misbehaves in public and the parents refuse to buy ice cream, they are using punishment to decrease the undesirable behaviour.

Principles of Operant Conditioning

1. Reinforcement – It is an essential principle of operant conditioning. It is the process of strengthening behaviour through reinforcers.

2. Punishment – It is the process of decreasing the likelihood of repeating a previous behaviour.

3. Shaping – It is the process of using reinforcers to guide individuals closer and closer to a desired behaviour.

First, any behaviour related to the desired behaviour is rewarded. Later, only behaviour that is very close to the desired behaviour is rewarded. Finally, only the desired behaviour itself is rewarded. This process of successive approximations leads to the final desired behaviour.

For example, Ben is trying to teach German to a student. He first rewards the student for introducing himself in German. Then, he rewards him for speaking in German for 30 minutes. The student gets his final reward when he speaks German fluently for an entire 1-hour period.
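The successive-approximation idea can be sketched in a few lines of Python. This is only an illustrative model, not Skinner's procedure: the function name `shaping_rewards`, the closeness scores between 0 and 1, and the threshold values are all assumptions made up for the example.

```python
def shaping_rewards(attempts, criteria):
    """Reward an attempt only if it meets the current criterion, then tighten it.

    attempts: hypothetical scores (0..1) for how close each attempt is
              to the desired behaviour.
    criteria: ascending thresholds, one per shaping stage (assumed values).
    Returns a list of (score, threshold, rewarded) tuples.
    """
    stage = 0
    log = []
    for score in attempts:
        threshold = criteria[stage]
        rewarded = score >= threshold
        log.append((score, threshold, rewarded))
        # After a rewarded attempt, move on to the next, stricter stage
        # (if there is one), so looser approximations stop being rewarded.
        if rewarded and stage < len(criteria) - 1:
            stage += 1
    return log


if __name__ == "__main__":
    for score, threshold, rewarded in shaping_rewards(
        [0.3, 0.2, 0.6, 0.5, 0.95], [0.25, 0.5, 0.9]
    ):
        print(f"attempt={score:.2f} criterion={threshold:.2f} rewarded={rewarded}")
```

In this sketch an attempt that would have earned a reward at an early stage (for instance 0.5) goes unrewarded once the criterion has risen, which mirrors how Ben stops rewarding a simple introduction once the student can hold a longer conversation.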

Reinforcement

Types of Reinforcement

1. Positive reinforcement is introducing a pleasant stimulus (a reward) to increase the repetition of a behaviour.

For example, a child is allowed to eat dessert (reward) for finishing the homework (behaviour).

2. Negative reinforcement is removing an unpleasant (aversive) stimulus, which tends to increase the likelihood that the behaviour is repeated.

For example, Alis cleans up his room (behaviour) to avoid his mother’s nagging (aversive stimulus).

3. Primary and Secondary Reinforcement: A primary reinforcer is something that satisfies or fulfils our biological, innate needs.

For example, getting food when you are very hungry.

A Secondary reinforcer is something that is associated with a primary reinforcer.

For example, James gets a paycheck (money, secondary reinforcer) for his services. He uses the money to buy a sweatshirt (clothing, which is a primary reinforcer).

4. Immediate and Delayed Reinforcers: Immediate reinforcement is a reward given immediately after an individual performs the desired behaviour.

For instance, giving a treat to a child just after he or she eats all the vegetables on the plate.

Delayed reinforcement is a reward that is not given immediately after an individual performs the desired behaviour. It is given at a later stage.

For instance, receiving a paycheck by the end of a month, or a trophy at the end of a sports session.

5. Continuous and Partial Reinforcement: Continuous reinforcement means reinforcing/rewarding a particular behaviour every single time it occurs.

For example, giving chocolate to a child every time he/she finishes the vegetables on a plate. It is a good technique because learning occurs quickly. However, extinction of the behaviour also occurs quickly. If we stop giving chocolate to the child for eating vegetables, he or she is likely to stop eating them.

Continuous reinforcement is very unlikely in real life. For instance, a clerk in a company does not receive appreciation for every single task he completes. But he keeps working because he does get occasional rewards and appreciation.

Partial or intermittent reinforcement means reinforcing/rewarding a particular behaviour only some of the time. Learning may occur at a slower pace, but the desired behaviour lasts longer. Gambling is an example of partial reinforcement. You do not win every single time, but you do win sometimes.

Reinforcement Schedules

Reinforcement schedules are patterns or rules that determine how often a desirable response will be reinforced. Skinner and his colleagues developed four partial reinforcement schedules. They differ in whether reinforcement depends on the number of responses (ratio) or on elapsed time (interval), and in whether that requirement is fixed or variable.

Reinforcement schedules are important for the process of learning. Fixed and variable schedules affect an individual differently, so both are useful for moulding, shaping and strengthening desired behaviour.

1. Fixed-Ratio Schedule: This schedule includes reinforcements that are given only after a specific number of responses.

For example, an individual gets a shopping store discount after buying 5 items.

2. Variable-Ratio Schedule: This schedule includes reinforcements that are given after an unpredictable number of responses.

For example, gambling and lottery games pay out quite unpredictably.

3. Fixed-Interval Schedule: This schedule includes reinforcements that are given only after a specific amount of time has passed.

For example, a weekly feedback session from an employer could serve as a reinforcer for the employees to work hard.

4. Variable-Interval Schedule: This schedule includes reinforcements that are given after an unpredictable amount of time has passed.

For example, teachers give pop quizzes at any time of the month. This reinforces students for studying and revising regularly.
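The four schedules above can be sketched as small Python functions that decide whether a given response earns a reinforcer. This is a simplified model under stated assumptions, not an experimental protocol: all function names are hypothetical, the variable-ratio schedule is approximated by rewarding each response with probability 1/mean, and the variable-interval schedule draws each waiting time uniformly between 0 and twice the mean.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses.

    Assumption: each response is rewarded with probability 1/mean_n,
    so rewards arrive on average once every mean_n responses.
    """
    def respond():
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(interval):
    """Reinforce the first response made at least `interval` after the last reward."""
    last_reward = 0.0

    def respond(now):
        nonlocal last_reward
        if now - last_reward >= interval:
            last_reward = now
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after every reward.

    Assumption: waits are uniform on [0, 2 * mean_interval].
    """
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_interval)

    def respond(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.uniform(0, 2 * mean_interval)
            return True
        return False

    return respond


if __name__ == "__main__":
    discount = fixed_ratio(5)            # e.g. a discount on every 5th purchase
    print([discount() for _ in range(10)])

    feedback = fixed_interval(7)         # e.g. weekly feedback, time in days
    print([feedback(day) for day in (1, 6, 7, 8, 13, 14)])
```

The ratio schedules only count responses, while the interval schedules take the current time as an argument, which reflects the distinction the text draws between "number of responses" and "amount of time".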

Punishment

1. Positive Punishment: Positive punishment is introducing an aversive (unpleasant) stimulus to decrease the behaviour it follows.

For example, Amol is speeding on the road and gets a traffic ticket for speeding.

2. Negative Punishment: Negative punishment works to decrease behaviour by withdrawing or removing something desirable because of undesirable behaviour.

For example, Amol’s parents find out about his speeding. They take away his driving privileges for some time.
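The terms "positive" and "negative" describe whether a stimulus is added or removed, while "reinforcement" and "punishment" describe whether the behaviour increases or decreases. A minimal sketch of that two-by-two classification, using a hypothetical helper `classify_consequence` whose name and string arguments are assumptions for illustration:

```python
def classify_consequence(stimulus_change, behaviour_effect):
    """Name the operant-conditioning category for a consequence.

    stimulus_change: "added" or "removed" (what happens to the stimulus).
    behaviour_effect: "increases" or "decreases" (what happens to the behaviour).
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behaviour_effect not in ("increases", "decreases"):
        raise ValueError("behaviour_effect must be 'increases' or 'decreases'")

    # "Positive" means something is added; "negative" means something is removed.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement increases behaviour; punishment decreases it.
    kind = "reinforcement" if behaviour_effect == "increases" else "punishment"
    return f"{sign} {kind}"


if __name__ == "__main__":
    # A traffic ticket is added and speeding decreases.
    print(classify_consequence("added", "decreases"))
    # Driving privileges are removed and speeding decreases.
    print(classify_consequence("removed", "decreases"))
```

Applied to the earlier examples, dessert for finished homework is a stimulus added with behaviour increasing, and an end to nagging after the room is cleaned is a stimulus removed with behaviour increasing.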

Applications of Operant Conditioning

Operant conditioning has been researched extensively and has applications in many fields.

Schools use operant conditioning in the form of instructions, exams, feedback, medals, and trophies.
Operant conditioning is also used in athletic training. Small victories are reinforced, leading to greater challenges and successes.
Workplaces also employ operant conditioning by providing rewards for specific, achievable, productive performance.
Parents also use operant conditioning to shape their children’s values and manners. Giving them treats and encouragement reinforces desired behaviours.
We can also use operant conditioning on ourselves. For example, to increase your study hours, give yourself a reward (watching an episode, eating ice cream) only after you study for a specific amount of time.