PROFESSIONAL DOG TRAINING "FOR THE REAL WORLD"

"We can train ANY dog!"

K9 Learning Theory

This is a "MUST READ" for anyone who is currently training or thinking about training a puppy or dog.

 "Training Tools of Choice!"

Classical and Operant Conditioning

An Introduction to Classical (Respondent) Conditioning

Developed by: W. Huitt and J. Hummel
Last Revised: May, 1997

Citation: Huitt, W., & Hummel, J. (1997). An introduction to classical (respondent) conditioning. Educational Psychology Interactive. Valdosta, GA: Valdosta State University. Retrieved [date], from



Call Now For FREE !!! 480-502-DOGS (3647) FREE !!! Training Advice
or 
e-mail us Now! 


Classical conditioning was the first type of learning to be discovered and studied within the behaviorist tradition (hence the name classical). The major theorist in the development of classical conditioning is Ivan Pavlov, a Russian scientist trained in biology and medicine (as was his contemporary, Sigmund Freud). Pavlov was studying the digestive system of dogs and became intrigued by his observation that dogs deprived of food began to salivate when one of his assistants walked into the room. He began to investigate this phenomenon and established the laws of classical conditioning. Skinner renamed this type of learning "respondent conditioning" since, in this type of learning, one is responding to an environmental antecedent.

Major concepts

Classical conditioning is Stimulus (S) elicits > Response (R) conditioning, since the antecedent stimulus (singular) causes (elicits) the reflexive or involuntary response. Classical conditioning starts with a reflex: an innate, involuntary behavior elicited or caused by an antecedent environmental event. For example, if air is blown into your eye, you blink. You have no voluntary or conscious control over whether the blink occurs.

The specific model for classical conditioning is:

  1. Unconditioned Stimulus (US) elicits > Unconditioned Response (UR): a stimulus will naturally (without learning) elicit or bring about a reflexive response.
  2. Neutral Stimulus (NS) ---> does not elicit the response of interest: this stimulus (sometimes called an orienting stimulus because it elicits an orienting response) is neutral since it does not elicit the Unconditioned (or reflexive) Response.
  3. The Neutral/Orienting Stimulus (NS) is repeatedly paired with the Unconditioned/Natural Stimulus (US).
  4. The NS is transformed into a Conditioned Stimulus (CS); that is, when the CS is presented by itself, it elicits or causes the CR (which is the same involuntary response as the UR; the name changes because it is elicited by a different stimulus). This is written CS elicits > CR.

In classical conditioning no new behaviors are learned. Instead, an association is developed (through pairing) between the NS and the US so that the animal / person responds to both events / stimuli (plural) in the same way; restated, after conditioning, both the US and the CS will elicit the same involuntary response (the person / animal learns to respond reflexively to a new stimulus).

The following restates these basic principles, using Pavlov's original experiments as the example.

Before conditioning

In order to have classical or respondent conditioning, there must exist a stimulus that will automatically or reflexively elicit a specific response. This stimulus is called the Unconditioned Stimulus (US, sometimes written UCS) because there is no learning involved in connecting the stimulus and response. There must also be a stimulus that will not elicit this specific response, but will elicit an orienting response. This stimulus is called a Neutral Stimulus or an Orienting Stimulus.

During conditioning

During conditioning, the neutral stimulus is presented first, followed by the unconditioned stimulus. Over time, the learner develops an association between these two stimuli (i.e., learns to make a connection between them).

After conditioning

After conditioning, the previously neutral or orienting stimulus will elicit the response previously only elicited by the unconditioned stimulus. The stimulus is now called a conditioned stimulus because it will now elicit a different response as a result of conditioning or learning. The response is now called a conditioned response because it is elicited by a stimulus as a result of learning. The two responses, unconditioned and conditioned, look the same, but they are elicited by different stimuli and are therefore given different labels.
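The before / during / after sequence can be sketched as a toy simulation. This is only an illustration, not a model from the source: the tone used as the neutral stimulus, the fixed gain per pairing, and the threshold at which the neutral stimulus starts acting as a conditioned stimulus are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Learner:
    """Toy learner; the numeric learning rule is an assumption for illustration."""
    association: float = 0.0   # strength of the NS-US link, from 0.0 to 1.0
    step: float = 0.25         # hypothetical gain per pairing
    threshold: float = 0.75    # hypothetical strength at which the NS acts as a CS

    def pair(self) -> None:
        """One conditioning trial: the tone (NS) is presented, then food (US)."""
        self.association = min(1.0, self.association + self.step)

    def respond(self, stimulus: str) -> str:
        """Return the response elicited by 'food' (US) or 'tone' (NS, later CS)."""
        if stimulus == "food":
            return "UR: salivation"      # reflex, no learning needed
        if self.association >= self.threshold:
            return "CR: salivation"      # same response, new eliciting stimulus
        return "orienting response"      # a neutral stimulus only draws attention


dog = Learner()
before = dog.respond("tone")   # before conditioning
for _ in range(3):             # during conditioning: three tone-then-food pairings
    dog.pair()
after = dog.respond("tone")    # after conditioning
print(before, "->", after)
```

Note that the response to food never changes; only the response to the tone does, which matches the point that no new behavior is learned, only a new eliciting stimulus.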

In the area of classroom learning, classical conditioning primarily influences emotional behavior. Things that make us happy, sad, angry, etc. become associated with neutral stimuli that gain our attention. For example, if a particular academic subject or remembering a particular teacher produces emotional feelings in you, those emotions are probably a result of classical conditioning.


An Introduction to Operant (Instrumental) Conditioning

Citation: Huitt, W., & Hummel, J. (1997). An introduction to operant (instrumental) conditioning. Educational Psychology Interactive. Valdosta, GA: Valdosta State University. Retrieved [date], from http://chiron.valdosta.edu/whuitt/col/behsys/operant.html.





A human being fashions his consequences as surely as he fashions his goods or his dwelling. Nothing that he says, thinks or does is without consequences.
- Norman Cousins,
 20th century editor and author

The major theorists for the development of operant conditioning are Edward Thorndike, John Watson, and B. F. Skinner. Their approach to behaviorism played a major role in the development of the science of psychology, especially in the United States. These theorists proposed that learning is the result of the application of consequences; that is, learners begin to connect certain responses with certain stimuli. This connection causes the probability of the response to change (i.e., learning occurs).

Thorndike labeled this type of learning instrumental. Using consequences, he taught kittens to manipulate a latch (i.e., an instrument). Skinner renamed instrumental as operant because it is more descriptive (i.e., in this learning, one is "operating" on, and is influenced by, the environment). Where classical conditioning illustrates S-->R learning, operant conditioning is often viewed as R-->S learning, since it is the consequence that follows the response that influences whether the response is likely or unlikely to occur again. It is through operant conditioning that voluntary responses are learned.

The 3-term model of operant conditioning (S--> R -->S) incorporates the concept that a response cannot occur without an environmental event (i.e., an antecedent stimulus) preceding it. While the antecedent stimulus in operant conditioning does not elicit or cause the response (as it does in classical conditioning), it can influence it. When the antecedent does influence the likelihood of a response occurring, it is technically called a discriminative stimulus.

It is the stimulus that follows a voluntary response (i.e., the response's consequence) that changes the probability of whether the response is likely or unlikely to occur again. There are two types of consequences: positive (sometimes called pleasant) and negative (sometimes called aversive). These can be added to or taken away from the environment in order to change the probability of a given response occurring again.

General Principles

There are 4 major techniques or methods used in operant conditioning. They result from combining the two major purposes of operant conditioning (increasing or decreasing the probability that a specific behavior will occur in the future), the types of stimuli used (positive/pleasant or negative/aversive), and the action taken (adding or removing the stimulus).

 

                     Outcome of Conditioning

                     Increase Behavior         Decrease Behavior

Positive Stimulus    Positive Reinforcement    Response Cost
                     (add stimulus)            (remove stimulus)

Negative Stimulus    Negative Reinforcement    Punishment
                     (remove stimulus)         (add stimulus)
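The four techniques described above amount to a small lookup keyed by the type of stimulus and the action taken. The sketch below is one possible encoding; the names `TECHNIQUES`, `INCREASES`, and `technique` are made up for this illustration.

```python
# (type of stimulus, action taken) -> technique, following the four-way grid.
TECHNIQUES = {
    ("positive", "add"): "positive reinforcement",
    ("positive", "remove"): "response cost",
    ("negative", "remove"): "negative reinforcement",
    ("negative", "add"): "punishment",
}

# The two reinforcement techniques increase behavior; the other two decrease it.
INCREASES = {"positive reinforcement", "negative reinforcement"}


def technique(stimulus, action):
    """Return (technique name, effect on the behavior) for one grid cell."""
    name = TECHNIQUES[(stimulus, action)]
    effect = "increase" if name in INCREASES else "decrease"
    return name, effect


for cell in TECHNIQUES:
    print(cell, "->", technique(*cell))
```

For a trainer, removing a treat the dog expected would be looked up as `technique("positive", "remove")`, which lands in the response cost cell.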

Schedules of consequences

Stimuli are presented in the environment according to a schedule, of which there are two basic categories: continuous and intermittent. Continuous reinforcement simply means that the behavior is followed by a consequence each time it occurs. Intermittent schedules are based either on the passage of time (interval schedules) or the number of correct responses emitted (ratio schedules). The consequence can be delivered based on the same amount of passage of time or the same number of correct responses (fixed), or it can be based on an amount of time or number of correct responses that varies around a particular number (variable). This results in four classes of intermittent schedules. [Note: Continuous reinforcement is actually a specific example of a fixed ratio schedule with only one response emitted before a consequence occurs.]

1. Fixed interval -- the first correct response after a set amount of time has passed is reinforced (i.e., a consequence is delivered). The time period required is always the same.

Notice that in the context of positive reinforcement, this schedule produces a scalloping effect during learning (a dramatic drop-off of responding immediately after reinforcement). Also notice the number of behaviors observed in a 30-minute time period.

2. Variable interval -- the first correct response after a set amount of time has passed is reinforced. After the reinforcement, a new time period (shorter or longer) is set with the average equaling a specific number over a sum total of trials.

Notice that this schedule reduces the scalloping effect and the number of behaviors observed in the 30-minute time period is slightly increased.

3. Fixed ratio -- a reinforcer is given after a specified number of correct responses. This schedule is best for learning a new behavior.

Notice that behavior is relatively stable between reinforcements, with a slight delay after reinforcement is given. Also notice the number of behaviors observed during the 30-minute time period is larger than that seen under either of the interval schedules.

4. Variable ratio -- a reinforcer is given after a set number of correct responses. After reinforcement the number of correct responses necessary for reinforcement changes. This schedule is best for maintaining behavior.

Notice that the number of responses per time period increases as the schedule of reinforcement is changed from fixed interval to variable interval and from fixed ratio to variable ratio.
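The four schedules can be sketched as small objects that decide whether a given correct response is reinforced. This is a rough illustration only. It models when a consequence is delivered, not how the learner's response rate changes (so it does not reproduce scalloping). The class names, the uniform distributions used for the variable schedules, and the one-response-per-minute session are assumptions made for the example.

```python
import random


class FixedRatio:
    """Reinforce every n-th correct response (n=1 is continuous reinforcement)."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio(FixedRatio):
    """Like FixedRatio, but the required count is redrawn after each reinforcer."""
    def __init__(self, mean, rng):
        super().__init__(mean)
        self.mean, self.rng = mean, rng

    def respond(self, t):
        reinforced = super().respond(t)
        if reinforced:
            # Assumed distribution: uniform on 1..2*mean-1, which averages to mean.
            self.n = self.rng.randint(1, 2 * self.mean - 1)
        return reinforced


class FixedInterval:
    """Reinforce the first correct response once `period` time units have passed."""
    def __init__(self, period):
        self.period, self.last = period, 0.0

    def respond(self, t):
        if t - self.last >= self.period:
            self.last = t
            return True
        return False


class VariableInterval(FixedInterval):
    """Like FixedInterval, but the period is redrawn after each reinforcer."""
    def __init__(self, mean, rng):
        super().__init__(mean)
        self.mean, self.rng = mean, rng

    def respond(self, t):
        reinforced = super().respond(t)
        if reinforced:
            # Assumed distribution: uniform on 0.5*mean..1.5*mean.
            self.period = self.rng.uniform(0.5 * self.mean, 1.5 * self.mean)
        return reinforced


def count_reinforcers(schedule, response_times):
    """Count how many of the responses at the given times are reinforced."""
    return sum(schedule.respond(t) for t in response_times)


session = range(1, 31)  # hypothetical session: one correct response per minute
print("FR-5: ", count_reinforcers(FixedRatio(5), session))
print("FI-10:", count_reinforcers(FixedInterval(10), session))
print("FR-1: ", count_reinforcers(FixedRatio(1), session))
```

Passing a seeded `random.Random` to the variable schedules keeps a run repeatable, which is convenient when comparing schedules side by side.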

In summary, the schedules of consequences are often called schedules of reinforcements because there is only one schedule that is appropriate for administering response cost and punishment: continuous or fixed ratio of one. In fact, certainty of the application of a consequence is the most important aspect of using response cost and punishment. Learners must know, without a doubt, that an undesired or inappropriate target behavior will be followed by removal of a positive/pleasant stimulus or the addition of a negative/aversive stimulus. Using an intermittent schedule when one is attempting to reduce a behavior may actually lead to a strengthening of the behavior, certainly an unwanted end result.

Premack Principle

The Premack Principle, often called "grandma's rule," states that a high frequency activity can be used to reinforce low frequency behavior. Access to the preferred activity is contingent on completing the low-frequency behavior. The high frequency behavior to use as a reinforcer can be determined by:

1.        Asking students what they would like to do;

2.        Observing students during their free time; or

3.        Determining what might be expected behavior for a particular age group. 
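The second method, observing free time, can be sketched as a frequency count: the activity chosen most often becomes the candidate reinforcer. The function name `pick_reinforcer` and the observation log are hypothetical, and treating "most frequent" as "most preferred" is a simplification.

```python
from collections import Counter


def pick_reinforcer(free_time_log, target_behavior):
    """Return the most frequently observed activity other than the target."""
    counts = Counter(free_time_log)
    counts.pop(target_behavior, None)   # the target cannot reinforce itself
    activity, _ = counts.most_common(1)[0]
    return activity


# Hypothetical observations of what one student chose during free time.
log = ["drawing", "computer", "reading", "computer", "computer", "drawing"]
reinforcer = pick_reinforcer(log, "reading")
print(f"First finish reading, then you may have {reinforcer} time.")
```

The sketch raises an error if the log contains nothing but the target behavior, since there is then no higher-frequency activity to offer.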

Analyzing Examples of Operant Conditioning

There are five basic processes in operant conditioning: positive and negative reinforcement strengthen behavior; punishment, response cost, and extinction weaken behavior.

1.        Positive Reinforcement--the term reinforcement always indicates a process that strengthens a behavior; the word positive has two cues associated with it. First, a positive or pleasant stimulus is used in the process, and second, the reinforcer is added (i.e., "positive" as in + sign for addition). In positive reinforcement, a positive reinforcer is added after a response and increases the frequency of the response.

2.        Negative Reinforcement-- the term reinforcement always indicates a process that strengthens a behavior; the word negative has two cues associated with it. First, a negative or aversive stimulus is used in the process, and second, the reinforcer is subtracted (i.e., "negative" as in a "-" sign for subtraction). In negative reinforcement, after the response the negative reinforcer is removed which increases the frequency of the response. (Note: There are two types of negative reinforcement: escape and avoidance. In general, the learner must first learn to escape before he or she learns to avoid.)

3.        Response Cost--if positive reinforcement strengthens a response by adding a positive stimulus, then response cost has to weaken a behavior by subtracting a positive stimulus. After the response the positive reinforcer is removed which weakens the frequency of the response.

4.        Punishment--if negative reinforcement strengthens a behavior by subtracting a negative stimulus, then punishment has to weaken a behavior by adding a negative stimulus. After a response a negative or aversive stimulus is added which weakens the frequency of the response.

5.        Extinction--No longer reinforcing a previously reinforced response (using either positive or negative reinforcement) results in the weakening of the frequency of the response. 

Rules in analyzing examples. The following questions can help in determining whether operant conditioning has occurred.

a. What behavior in the example was increased or decreased?

b. Was the behavior increased (if yes, the process has to be either positive or negative reinforcement) or decreased (if yes, the process is either response cost or punishment)?

c. What was the consequence / stimulus that followed the behavior in the example?

d. Was the consequence / stimulus added or removed? If added the process was either positive reinforcement or punishment. If it was subtracted, the process was either negative reinforcement or response cost.

Examples. The following examples are provided to assist you in analyzing examples of operant conditioning.

a. Billy likes to camp out in the backyard. He camped out every Friday during the month of June. The last time he camped out, some older kids snuck up to his tent while he was sleeping and threw a bucket of cold water on him. Billy has not camped out for three weeks.

1. What behavior was changed? camping out

2. Was the behavior strengthened or weakened? weakened (eliminate positive and negative reinforcement)

3. What was the consequence? having water thrown on him

4. Was the consequence added or subtracted? added

Since a consequence was added and the behavior was weakened, the process was punishment.

b. Every time Madge raises her hand in class she is called on. She raised her hand 3 times during the first class, 3 times in the second and 4 times during the last class.

1. What behavior was changed? hand raising

2. Was the behavior strengthened or weakened? strengthened (eliminates response cost, punishment, and extinction)

3. What was the consequence? being called on

4. Was the consequence added or subtracted? added

Since the consequence was added and the behavior was strengthened, the process is positive reinforcement.

c. Gregory is being reinforced using a token economy. When he follows a direction / command he earns a point. At the end of each day, he can "buy" free time, TV privileges, etc. with his points. When he misbehaves or doesn't follow a command, he loses points. Gregory used to call his mom names. Since he has been on the point system, his name calling has been reduced to almost zero.

1. What behavior was changed? name calling

2. Was the behavior strengthened or weakened? weakened (eliminate positive and negative reinforcement)

3. What was the consequence? losing points

4. Was the consequence added or subtracted? subtracted

Since the consequence was subtracted and the behavior was weakened, the process is response cost.

d. John did not go to the dentist every 6 months for a checkup. Instead, he waited until a tooth really hurt, then went to the dentist. After two emergency trips to the dentist, John now goes every 6 months.

1. What behavior was changed? going to the dentist

2. Was the behavior strengthened or weakened? strengthened (eliminate response cost and punishment)

3. What was the consequence? tooth no longer hurting

4. Was the consequence added or subtracted? subtracted

Since the consequence was subtracted and the behavior was strengthened, the process is negative reinforcement.
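The four analysis questions reduce to two answers: whether the behavior was strengthened or weakened, and whether the consequence was added or subtracted. As a rough sketch, the function below (the name `classify` is made up here) maps those two answers to a process and replays the four examples above. Extinction is left out because it involves no consequence at all.

```python
def classify(change, consequence):
    """Map the answers to questions 2 and 4 onto the name of the process.

    change: 'strengthened' or 'weakened'
    consequence: 'added' or 'subtracted'
    """
    processes = {
        ("strengthened", "added"): "positive reinforcement",
        ("strengthened", "subtracted"): "negative reinforcement",
        ("weakened", "added"): "punishment",
        ("weakened", "subtracted"): "response cost",
    }
    return processes[(change, consequence)]


# (behavior, answer to question 2, answer to question 4) for examples a-d.
examples = [
    ("camping out", "weakened", "added"),
    ("hand raising", "strengthened", "added"),
    ("name calling", "weakened", "subtracted"),
    ("going to the dentist", "strengthened", "subtracted"),
]
results = {b: classify(change, cons) for b, change, cons in examples}
for behavior, process in results.items():
    print(f"{behavior}: {process}")
```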

Applications of Operant Conditioning to Education:

Our knowledge about operant conditioning has greatly influenced educational practices. Children of all ages exhibit behavior. Teachers and parents are, by definition, behavior modifiers: if a child is behaviorally the same at the end of the academic year, you will not have done your job as a teacher. Children are supposed to learn (i.e., produce relatively permanent change in behavior or behavior potential) as a result of the experiences they have in the school / classroom setting.

Behavioral studies in classroom settings have clearly established ways to organize and arrange the physical classroom to facilitate both academic and social behavior. Teaching itself has also been the focus of numerous studies, and has resulted in a variety of teaching models for educators at all levels. Programmed instruction is only one such model. Programmed instruction requires that learning be done in small steps, with the learner being an active participant (rather than passive), and that immediate corrective feedback is provided at each step.


TLC K9 ACADEMY PO BOX 71297 PHOENIX, AZ  85050-1005

Copyright © 2007, TLC K9 ACADEMY LLC, all rights reserved. All content and images are copyright protected by law and nothing is to be copied, reproduced, or distributed in any way without the written consent of TLC K9 ACADEMY LLC  send inquiries to info@tlck9.com Pictures Courtesy of "The Pet Professor".