In the last section, we gave
an overview of operant conditioning. In this section, we begin the
discussion of one major aspect of operant conditioning called reinforcement.
So let us begin by reviewing a few things from last time
that relate to the work of B. F. Skinner. As we saw last time, Skinner
systematically demonstrated a couple of major things. Number one, if something
followed some response and the behavior increased, the procedure was called
reinforcement and the things that caused the increase were called
reinforcers.
In addition to that, as we see on slide three, if
something occurred after the response and the behavior decreased, the
procedure was called punishment and the thing that actually caused the
decrease was called the punisher.
So as we show again in slide four, reinforcers always
increase the behavior and punishers always decrease the behavior. And as I
stated several times, there are no exceptions to that rule.
Now, as we talked about right at the end of last time, and
as we see here in slide five, there were two types of reinforcers and
punishers. Basically, it is related to whether you added or removed
something. If you added something following a response, that was considered
to be positive, and if you removed something following that response, that
was considered to be negative. So, as we saw in the last section, positive
does not mean good and negative does not mean bad; they relate to whether
you’re adding or removing something following some kind of response.
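The two-by-two rule just described can be written down as a small lookup. This is only a sketch in Python, not anything from Skinner's own work; the function name `classify_procedure` and its two argument names are made up for illustration.

```python
def classify_procedure(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant procedure from what happened after the response.

    stimulus_change: "added" or "removed" (what was done following the response)
    behavior_change: "increased" or "decreased" (what the behavior did afterward)
    Both argument names are hypothetical labels chosen for this sketch.
    """
    # Positive/negative only says whether something was added or removed.
    sign = {"added": "positive", "removed": "negative"}
    # Reinforcement/punishment only says whether the behavior went up or down.
    effect = {"increased": "reinforcement", "decreased": "punishment"}
    if stimulus_change not in sign or behavior_change not in effect:
        raise ValueError("expected added/removed and increased/decreased")
    return f"{sign[stimulus_change]} {effect[behavior_change]}"


print(classify_procedure("added", "increased"))
print(classify_procedure("removed", "increased"))
```

Notice that nothing in the lookup asks whether the thing added or removed is pleasant; the label depends only on the add/remove direction and on what the behavior does afterward.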
So let’s give an example of positive reinforcement. This
is shown in slide six. If you add something following a response and the
behavior increases, that is called positive reinforcement. The
classic example of a goodie is a chocolate chip cookie. So when a little kid
comes up and gives his mom a big hug and a kiss and he gets a chocolate chip
cookie, guess what happens? Pretty soon the little kid is giving mom hugs
and kisses all the time and getting lots and lots and lots of chocolate chip
cookies. So in essence, mom is positively reinforcing her child with a
chocolate chip cookie.
Now the key for Skinner, and one of the major points, is
how do you get the behavior to occur the first time? This is shown in slide
seven. In essence, to get behavior to occur the first time, what we need to
use is a procedure called shaping by successive approximations, or what is
also simply called shaping. What you do is you reinforce successive
approximations to a desired response, for example, getting a rat to bar
press. So how do you get a rat to bar press? Well, when you put the rat in the
cage, often it will just wander around for an hour or so and won’t even
press the bar at all. It doesn’t have any idea what the bar is; it will
just wander around in there. So if you want to make the procedure go much
faster, you give the rat reinforcement when it gets close to the bar, then
when it finally touches the bar, and on and on. We’ll provide more detailed
examples a little bit later.
Now the key point with shaping (as we see in slide eight)
is this concept: you must deprive the organism of what you want
to reinforce it with. Let’s say that you’re
trying to bribe Bill Gates with money. Well, you probably couldn’t do that,
because Bill Gates has 37 billion dollars or so, and it’s pretty hard to
bribe Bill Gates with $100. But on the other hand, you could deprive Bill
Gates of food. This is often done with animals: you get the animal hungry
before placing it in a learning environment. An example of this is
restricting food until the animal is at 80% of its normal body weight. When
it’s hungry, it is much more willing to learn and do a lot more things.
Now kids work exactly the same way and so do adults. So
what you want to do if you’re going to reinforce kids is deprive the kid of
the reinforcer that you want to use. This is shown in slide nine. So, for
example, if kids haven’t had chocolate chip cookies for a long time, and if
you wanted to use those as reinforcers, they’re more likely to make the
particular responses that you want.
In addition to that, what you want to do is use the
chocolate chip cookies to reinforce small, appropriate behaviors that
approximate the final desired response. For example, let’s say
you want the kid to pick up their whole room, including all of the clothes
in the room. Well, what you might start with is reinforcing them when they
just go into the room. When they’re doing that at a high, steady rate, give
them a reinforcer when they pick up a piece of clothing in the room, then two
pieces of clothing, and on and on until ultimately they have all their
clothes picked up. Then you could also start working on the rest of the
room as well. So in general, what you do is reinforce small,
appropriate behaviors that approximate the ultimate desired
response, that is, a clean room. Gradually, what you do is increase the
amount of behavior that you require before you give the
chocolate chip cookie.
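The idea of raising the requirement only once the current step is happening at a high, steady rate can be sketched as a simple decision rule. This is a rough illustration in Python under stated assumptions: the function `next_step`, its arguments, and especially the 0.9 cutoff for "high, steady rate" are invented for this sketch and do not come from the lecture.

```python
def next_step(current_step, recent_trials, last_step, required_rate=0.9):
    """Decide which approximation to reinforce next.

    current_step:  index of the approximation currently being reinforced
                   (0 = go into the room, 1 = pick up one piece of clothing, ...)
    recent_trials: list of booleans, True when the current requirement was met
    last_step:     index of the final desired response (the clean room)
    required_rate: ASSUMED cutoff for a "high, steady rate"; not from the lecture
    """
    if not recent_trials:
        return current_step  # nothing observed yet, so keep the same requirement
    rate = sum(recent_trials) / len(recent_trials)
    if rate >= required_rate and current_step < last_step:
        return current_step + 1  # ask for a little more before the next cookie
    return current_step  # not steady yet, or already at the final response


print(next_step(0, [True, True, True, True, True], last_step=3))
print(next_step(0, [True, False, False], last_step=3))
```

The only design choice here is that the requirement moves one small step at a time and never skips ahead, which mirrors reinforcing one piece of clothing before asking for two.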
Now in addition to regular shaping, there’s another type
of shaping procedure, shown in slide 10, called reverse
shaping. Reverse shaping is a lot different from regular shaping. In
essence, what you do is you start at the end response and reinforce it, then
you take a step backward and reinforce that, then another step, and so on.
The classic examples of a reverse shaping procedure are
shown in slide 11. That is, Stuart Little, the mouse, and teaching a kid
to tie their shoes.
Let’s start with teaching the kid to tie their shoes first.
If you’re going to use the reverse shaping procedure, you first look at all
the things that a person has to do to tie their shoes. Well, the first thing
they have to do is cross one string over the other and put one underneath.
Then they have to make a loop, then they have to take the other string and
wrap it around the loop, then pull it through, and then finally pull the two
bows together. In reverse shaping, what you do first is start at the very
end response. So you would have them pull the two bows together, and when the
kid can do that and do it well, then you add the step before it. That is,
they pull the string through and then pull the bows tight. Then they wrap
the string around the loop, pull it through, and pull the bows tight. Then
they make the loop, wrap the string around it, pull it through, and pull it
tight, and on and on and on, until the
kid goes all the way back to the original starting point of tying their
shoes, that is, crossing the two strings over and putting one underneath the
other. So in essence what you do is you start at the end response and you
work back toward the beginning.
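The order of practice in reverse shaping can be laid out mechanically: each stage starts one step earlier than the previous stage and always runs through to the end response. Here is a minimal sketch in Python; the function name `reverse_shaping_stages` and the wording of the step labels are assumptions made for illustration.

```python
def reverse_shaping_stages(steps):
    """Return what is practiced at each training stage.

    Stage 1 is only the final step; each later stage adds the step just
    before it, so every stage ends at the finished response.
    """
    return [steps[i:] for i in range(len(steps) - 1, -1, -1)]


# Step labels paraphrase the shoe-tying breakdown; the wording is illustrative.
shoe_steps = [
    "cross the strings and put one underneath",
    "make a loop",
    "wrap the other string around the loop",
    "pull it through",
    "pull the two bows together",
]

for number, stage in enumerate(reverse_shaping_stages(shoe_steps), start=1):
    print(number, " -> ".join(stage))
```

The first stage printed is just the last step, and the final stage is the whole sequence from crossing the strings onward.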
Stuart Little, the mouse, is another classic example. This
technique is used a lot in animal training where you want an organism to do
a lot of different things. Again, you start at the end response. Now Stuart
Little was a mouse and what the director wanted the mouse to do was run
around and do all sorts of things, go over tables, go under chairs, go
behind doors and ultimately get up in his bed and lie down and go to sleep.
Ok, so that’s exactly what we start with.
We start at the end response, so what we do is we
deprive Stuart Little of food and put a little bit of food in Stuart
Little’s bed. And what does Stuart Little do? Stuart Little gets up in his
bed and eats the food. Then what we do is we go one step back. We put
Stuart Little on the far end of the counter, we put some food in his bed, and
then we let him go and he runs across the counter, gets up on his bed,
eats the food, and lies down. And then what we do is we go back one step
further. We have a little ramp that Stuart Little needs to run up. So Stuart
Little runs up the ramp, runs across the counter, gets in his bed, and eats
the food, and on and on and on, until we have Stuart Little at the other end
of the house, running across the house, going over this and that, jumping up
and down and running through, driving cars, whatever it may be, until
finally he gets up on his bed and lies down. This is a classic
example of how to train an organism. Again, we start at the end response and
work back to the beginning. In essence, that lets us build a wide variety of
different behaviors that the organism must go through.
So again, the key to shaping is depriving the organism.
And whether you use a regular shaping procedure, where you reinforce
behaviors leading up to the ultimate response, or reverse shaping, where you
start at the end response, the key is that you reinforce closer and closer
approximations to the final desired behavior.
Now the next concept that we want to talk about in
relation to operant conditioning and reinforcement is what we
call negative reinforcement. This is shown in slide 12. Basically, in
negative reinforcement what you do is you remove something following a
response and the behavior increases. So again, what we’re doing is removing
something (that is the negative part), and the behavior increases (that is
the reinforcement part). So,
removing something following the response and causing the behavior to go up
is negative reinforcement.
Now there are two types of negative reinforcement. The
first type is shown in slide 13 and is called escape learning, and it’s
really straightforward and simple. If you escape from something that’s
aversive, the next time that you are in the same situation, you will make
the same response. So how does that work? Well, let’s use the classic
example in relation to spousal abuse. This example is shown in slide 14.
Let’s say that the husband and wife are having an argument
in the kitchen, and the wife is yelling, really yelling loudly, at her
husband, and on and on and on. Now, that is an extremely aversive stimulus to
the husband. So, the husband smacks the wife. As a result of that, the wife
stops yelling immediately. Consequently, because the husband has removed
something that’s aversive to him, in the next situation he will do the same
thing again. That is, the husband has been negatively reinforced. He stopped
the wife from yelling, and so the next time the husband is in a
similar situation, he will hit his spouse again. So, if we
concentrate only on feelings and nothing else (“No, I won’t do it, no, I
won’t do it, I’ll control my temper,” on and on and on), it won’t work. We
have to control the behavior, because the husband was reinforced for a
particular behavior, and negative reinforcers are very, very powerful.
The second type of negative reinforcement is what we call
avoidance, and that is shown beginning
on slide 15. In avoidance, you make a response that keeps something
aversive from happening in the first place. The classic example goes
something like this and is
shown in slide 16. You have a little kid in a candy store, and the little
kid wants a candy bar. So the kid is by the counter saying, “Mommy,
can I have a candy bar, I want a candy bar, please can I have a candy bar,”
on and on. “I want a candy bar.” When mom says, “No, you can’t have a candy
bar,” the kid starts to get more persistent and more obnoxious: “I
want a candy bar, give me a candy bar, I want a candy bar, please give me a
candy bar, please, please, please.” As a result, the parent becomes
extremely embarrassed and upset and gives the kid a candy bar. As a
result, the kid stops yelling, and as a result of that, the parent is
negatively reinforced. The parent has escaped from the aversive stimulus.
There is no more yelling. So, what will happen the next time the kid starts
to yell in the aisle? What’s the parent going to do? The parent is going to
stuff a candy bar in the kid’s mouth.
Now how does that relate to an avoidance type of
situation? Well, instead of waiting for the kid to start yelling about having
a candy bar when they get into the store or right by the aisle, the next time
the parent comes up to the counter, the parent just gives the kid the candy
bar. So, what the parent does is avoid the major scene that happened before.
Now you need to know (as we see in slide 17) that the kid
has been positively reinforced for their behavior. So what does the kid learn
from all of this? When I make some particular kind of response, such as
yelling and screaming in the store, I get what I want; I get the candy bar.
So what are you going to see with the kid’s behavior over time in the store?
Every time the kid wants something, they’re going to start yelling and
screaming in the store, because the kid has been positively reinforced for
that behavior while the parent
has been negatively reinforced for their behavior.
So in this section, we have reviewed a variety of
aspects of reinforcement. In the next section, we’re going to continue with
this discussion and discuss some variables that
affect reinforcement.