SAS Enterprise Miner

User : Chris
Date : 12MAY2003:08:51:55
Notes:


EM Workspace:


WORK.PROBMODEL

Input Data Settings:


  • All variables

  • Interval Variables

  • Class Variables

  • Notes: not available


    Data Partition

  • Partition Settings

  • Output

  • Log

  • Training Code

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1665.4330624               .               . 
     Average Squared Error             0.1987311718    0.1988206347     0.204111031 
     Average Error Function            0.5770481258    0.5752822719    0.5925783353 
     Degrees of Freedom for Error              1413               .               . 
     Model Degrees of Freedom                    11               .               . 
     Total Degrees of Freedom                  1424               .               . 
     Divisor for ASE                           2848            2150            2158 
     Error Function                    1643.4330624    1236.8568845    1278.7840476 
     Final Prediction Error            0.2018253585               .               . 
     Maximum Absolute Error            0.9807931273    0.9513083205    0.9825489928 
     Mean Square Error                 0.2002782651    0.1988206347     0.204111031 
     Sum of Frequencies                        1424            1075            1079 
     Number of Estimate Weights                  11               .               . 
     Root Average Sum of Squares       0.4457927453    0.4458930754    0.4517864882 
     Root Final Prediction Error       0.4492497729               .               . 
     Root Mean Squared Error           0.4475245972    0.4458930754    0.4517864882 
     Schwarz's Bayesian Criterion      1723.3065384               .               . 
     Sum of Squared Errors             565.98637725    427.46436458    440.47160481 
     Sum of Case Weights Times Freq            2848            2150            2158 
     Misclassification Rate            0.3139044944     0.328372093    0.3160333642 
     Total Profit for BAD                       452             342             361 
     Average Profit for BAD            0.3174157303    0.3181395349    0.3345690454 
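
The training-column entries in this table are tied together by simple arithmetic, which gives a quick sanity check on the export. The Python sketch below is not part of the original report. It recomputes several rows from the others. The function name `derive_fit_stats` is hypothetical, and the formulas are inferred from the numbers shown here rather than taken from SAS documentation. In particular, the Mean Square Error divisor (Divisor for ASE minus twice the model degrees of freedom) is an assumption that happens to reproduce the reported value.

```python
import math


def derive_fit_stats(n, p, divisor, error_function, sse):
    """Recompute derived fit statistics from the base quantities.

    n              -- Sum of Frequencies (training cases)
    p              -- Model Degrees of Freedom
    divisor        -- Divisor for ASE
    error_function -- Error Function value
    sse            -- Sum of Squared Errors
    The relationships are inferred from the table, not from SAS documentation.
    """
    ase = sse / divisor
    return {
        "aic": error_function + 2 * p,            # Akaike's Information Criterion
        "sbc": error_function + p * math.log(n),  # Schwarz's Bayesian Criterion
        "ase": ase,                               # Average Squared Error
        "rase": math.sqrt(ase),                   # Root Average Sum of Squares
        "mse": sse / (divisor - 2 * p),           # assumed divisor, see lead-in
        "fpe": ase * (n + p) / (n - p),           # Final Prediction Error
        "avg_error": error_function / divisor,    # Average Error Function
    }


# Training-column values copied from the table above.
stats = derive_fit_stats(
    n=1424, p=11, divisor=2848,
    error_function=1643.4330624, sse=565.98637725,
)
for name, value in stats.items():
    print(f"{name:10s} {value:.10f}")
```

Each printed value can be compared against the matching Training entry. Agreement to several decimal places suggests the table was exported without truncation.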
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Transform Variables

  • Interval Variables and Transformations

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1702.3402228               .               . 
     Average Squared Error             0.1884140763      0.20629268    0.2098287231 
     Average Error Function            0.5513835052    0.5912328542    0.6029404376 
     Degrees of Freedom for Error              1358               .               . 
     Model Degrees of Freedom                    66               .               . 
     Total Degrees of Freedom                  1424               .               . 
     Divisor for ASE                           2848            2150            2158 
     Error Function                    1570.3402228    1271.1506365    1301.1454642 
     Final Prediction Error            0.2067282575               .               . 
     Maximum Absolute Error            0.9993101248    0.9758885141    0.9782838827 
     Mean Square Error                 0.1975711669      0.20629268    0.2098287231 
     Sum of Frequencies                        1424            1075            1079 
     Number of Estimate Weights                  66               .               . 
     Root Average Sum of Squares       0.4340669031    0.4541945398    0.4580706529 
     Root Final Prediction Error       0.4546737925               .               . 
     Root Mean Squared Error           0.4444897827    0.4541945398    0.4580706529 
     Schwarz's Bayesian Criterion      2049.5810789               .               . 
     Sum of Squared Errors             536.60328941    443.52926194    452.81038442 
     Sum of Case Weights Times Freq            2848            2150            2158 
     Misclassification Rate            0.2914325843     0.343255814    0.3392029657 
     Total Profit for BAD                       452             342             361 
     Average Profit for BAD            0.3174157303    0.3181395349    0.3345690454 
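
This fit uses 66 model degrees of freedom against 11 for the first regression in this report, and the gap between training and validation misclassification is wider here. The Python sketch below is an illustration added to the report, with a hypothetical helper name `generalization_gap`. It makes the comparison explicit using the misclassification rates printed in the two tables.

```python
def generalization_gap(train_rate, valid_rate):
    """Validation minus training misclassification rate.

    A larger positive gap is commonly read as a sign of overfitting.
    """
    return valid_rate - train_rate


# (model degrees of freedom, training rate, validation rate) from the two tables.
fits = [
    (11, 0.3139044944, 0.328372093),
    (66, 0.2914325843, 0.343255814),
]

for dof, train_rate, valid_rate in fits:
    gap = generalization_gap(train_rate, valid_rate)
    print(f"{dof:3d} model DF: gap = {gap:.4f}")
```

The larger model fits the training partition better but does worse on validation, which is the pattern one would expect when extra parameters mostly capture noise.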
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1644.4959589               .               . 
     Average Squared Error              0.195686321    0.1965345493    0.2057414868 
     Average Error Function             0.566887626    0.5672732881    0.5912031272 
     Degrees of Freedom for Error              1409               .               . 
     Model Degrees of Freedom                    15               .               . 
     Total Degrees of Freedom                  1424               .               . 
     Divisor for ASE                           2848            2150            2158 
     Error Function                    1614.4959589    1219.6375694    1275.8163484 
     Final Prediction Error            0.1998528147               .               . 
     Maximum Absolute Error            0.9980081472    0.9628759504    0.9627875788 
     Mean Square Error                 0.1977695679    0.1965345493    0.2057414868 
     Sum of Frequencies                        1424            1075            1079 
     Number of Estimate Weights                  15               .               . 
     Root Average Sum of Squares       0.4423644663    0.4433221732     0.453587353 
     Root Final Prediction Error       0.4470490071               .               . 
     Root Mean Squared Error            0.444712905    0.4433221732     0.453587353 
     Schwarz's Bayesian Criterion      1723.4143352               .               . 
     Sum of Squared Errors             557.31464223    422.54928096    443.99012851 
     Sum of Case Weights Times Freq            2848            2150            2158 
     Misclassification Rate            0.3174157303            0.32    0.3271547729 
     Total Profit for BAD                       452             342             361 
     Average Profit for BAD            0.3174157303    0.3181395349    0.3345690454 
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Replacement

    Imputation tables:

  • Replacement Settings:

  • Output

  • Log

  • Score Code

  • Notes: not available


    Transform Variables

  • Interval Variables and Transformations

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1629.9090278               .               . 
     Average Squared Error             0.1970110346    0.1951690659    0.2031285138 
     Average Error Function            0.5701927766     0.563726985    0.5848042941 
     Degrees of Freedom for Error              1421               .               . 
     Model Degrees of Freedom                     3               .               . 
     Total Degrees of Freedom                  1424               .               . 
     Divisor for ASE                           2848            2150            2158 
     Error Function                    1623.9090278    1212.0130178    1262.0076667 
     Final Prediction Error            0.1978428897               .               . 
     Maximum Absolute Error            0.9961446637    0.9638925947    0.9630574649 
     Mean Square Error                 0.1974269621    0.1951690659    0.2031285138 
     Sum of Frequencies                        1424            1075            1079 
     Number of Estimate Weights                   3               .               . 
     Root Average Sum of Squares       0.4438592508    0.4417794312    0.4506978077 
     Root Final Prediction Error       0.4447953347               .               . 
     Root Mean Squared Error           0.4443275393    0.4417794312    0.4506978077 
     Schwarz's Bayesian Criterion       1645.692703               .               . 
     Sum of Squared Errors             561.08742643    419.61349161    438.35133286 
     Sum of Case Weights Times Freq            2848            2150            2158 
     Misclassification Rate            0.3244382022    0.3227906977    0.3225208526 
     Total Profit for BAD                       452             342             361 
     Average Profit for BAD            0.3174157303    0.3181395349    0.3345690454 
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Neural Network

    Optimization plot:

    Optimization

     
     Fit Statistic                     Training    Validation       Test 
      
     [ TARGET=BAD ]                         .            .           . 
     Average Profit                        0.27         0.27        0.29 
     Misclassification Rate                0.27         0.28        0.29 
     Average Error                         0.53         0.54        0.57 
     Average Squared Error                 0.18         0.19        0.19 
     Sum of Squared Errors               606.46       465.14      486.79 
     Root Average Squared Error            0.43         0.43        0.44 
     Root Final Prediction Error           0.45          .           . 
     Root Mean Squared Error               0.44         0.43        0.44 
     Error Function                     1785.49      1364.36     1426.57 
     Mean Squared Error                    0.19         0.19        0.19 
     Maximum Absolute Error                0.98         0.95        0.97 
     Final Prediction Error                0.20          .           . 
     Divisor for ASE                    3348.00      2510.00     2510.00 
     Model Degrees of Freedom             96.00          .           . 
     Degrees of Freedom for Error       1578.00          .           . 
     Total Degrees of Freedom           1674.00          .           . 
     Sum of Frequencies                 1674.00      1255.00     1255.00 
     Sum Case Weights * Frequencies     3348.00      2510.00     2510.00 
     Akaike's Information Criterion     1977.49          .           . 
     Schwarz's Bayesian Criterion       2498.09          .           . 
      
  • Network settings

  • Variables

  • Output

  • Log

  • Training Code

  • Score Code

    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)


    Tree

    Model assessment plot:

    SAS Graphics

     
     Fit Statistic                     Training    Validation       Test 
      
     Average Squared Error                 0.22         0.22        0.22 
     Sum of Squared Errors               617.06       466.39      481.08 
     Root Average Squared Error            0.47         0.47        0.47 
     Maximum Absolute Error                0.68         0.68        0.68 
     Divisor for ASE                    2848.00      2150.00     2158.00 
     Total Degrees of Freedom           1424.00          .           . 
     Misclassification Rate                0.32         0.32        0.33 
     Number of Estimated Weights           1.00          .           . 
     Sum of Frequencies                 1424.00      1075.00     1079.00 
     Sum Case Weights * Frequencies     2848.00      2150.00     2158.00 
      
     
      LEAF   Train   Valid   Valid %   Valid %   Train %   Train % 
        ID       N       N     BAD=1     BAD=0     BAD=1     BAD=0 
      
         1    1424    1075     31.81     68.19     31.74     68.26 
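
The leaf table shows a single leaf, so the tree made no split and every case receives the same predicted probability. Under that reading, the training fit statistics follow directly from the event share in the leaf. The Python sketch below is added for illustration. The function name `single_leaf_stats` is hypothetical, and the event count of 452 is an assumption taken from the "Total Profit for BAD" training value in the regression tables, which is consistent with the 31.74% training share shown for this leaf.

```python
import math


def single_leaf_stats(n_events, n_total, divisor):
    """Training fit statistics for a tree that predicts one constant probability."""
    p = n_events / n_total          # predicted P(BAD=1) for every case
    ase = p * (1 - p)               # average squared error of a constant prediction
    return {
        "misclassification": min(p, 1 - p),  # the majority class is always predicted
        "ase": ase,
        "rase": math.sqrt(ase),
        "max_abs_error": max(p, 1 - p),      # worst case: a minority-class observation
        "sse": ase * divisor,
    }


# 452 events out of 1424 training cases (assumed count, see lead-in);
# 2848 is the Divisor for ASE in the table above.
stats = single_leaf_stats(n_events=452, n_total=1424, divisor=2848)
for name, value in stats.items():
    print(f"{name:18s} {value:.2f}")
```

Rounded to two decimals, these values agree with the Training column of the tree's fit statistics, which supports reading this tree as a no-split baseline.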
      
  • English rules

  • Sequence

  • Matrix

    Target information
    Name: BAD
    Label:
    Measurement: binary

    Tree settings


    Splitting criterion: Chi-Square Test
    Significance Level: 0.3
    Minimum number of observations in a leaf: 1
    Observations required for a split search: 16
    Maximum number of branches from a node: 4
    Maximum depth of tree: 6
    Splitting rules saved in each node: 5
    Surrogate rules saved in each node: 0
    Do not treat missing as an acceptable value
    Model assessment measure: Proportion Correctly Classified
    Subtree: Best assessment value
    Observations sufficient for split search: 1674
    Maximum tries in an exhaustive split search: 5000
    Do not use profit matrix during split search
    Do not use prior probability in split search
    P-value adjustment: KASS DEPTH
    Apply KASS BEFORE choosing number of branches

  • Log

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Filter Outliers

    Data sets:

    Automatic Filtering: none

  • Interval variables

  • Class Variables

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1572.4632967               .               . 
     Average Squared Error             0.1960866071    0.1996476961    0.2120835677 
     Average Error Function            0.5680694614    0.5759982067    0.6101011668 
     Degrees of Freedom for Error              1373               .               . 
     Model Degrees of Freedom                     4               .               . 
     Total Degrees of Freedom                  1377               .               . 
     Divisor for ASE                           2754            2150            2158 
     Error Function                    1564.4632967    1238.3961444    1316.5983179 
     Final Prediction Error            0.1972291365               .               . 
     Maximum Absolute Error            0.9967935362    0.9628342136    0.9667716885 
     Mean Square Error                 0.1966578718    0.1996476961    0.2120835677 
     Sum of Frequencies                        1377            1075            1079 
     Number of Estimate Weights                   4               .               . 
     Root Average Sum of Squares       0.4428166744    0.4468195341    0.4605253171 
     Root Final Prediction Error        0.444104871               .               . 
     Root Mean Squared Error           0.4434612405    0.4468195341    0.4605253171 
     Schwarz's Bayesian Criterion      1593.3739467               .               . 
     Sum of Squared Errors             540.02251594    429.24254659    457.67633916 
     Sum of Case Weights Times Freq            2754            2150            2158 
     Misclassification Rate            0.3180827887     0.311627907    0.3382761816 
     Total Profit for BAD                       436             342             361 
     Average Profit for BAD            0.3166303558    0.3181395349    0.3345690454 
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Dmneural

  • Fit Statistic
     
     Statistic    Label of Statistic                TRAINING1    TRAINING2    TRAINING3 
      
      _STAGE_     Training Stage                         0.00         1.00         2.00 
      _SSE_       Sum of Squared Errors                590.59       589.46       588.62 
      _RMSE_      Root Mean Squared Error                0.66         0.66         0.66 
      _ACCU_      Accuracy                              68.34        68.41        68.41 
      _AIC_       Akaike's Information Criterion     -1151.68     -1140.32     -1128.28 
      _SBC_       Schwarz's Bayesian Criterion       -1115.09     -1067.13     -1018.50 
      _PROF_      Total Profit                         436.00       436.00       436.00 
      _APROF_     Average Profit                         0.32         0.32         0.32 
      _IC_        Investment Cost                        0.00         0.00         0.00 
      _ROI_       Return on Investment                   0.00         0.00         0.00 
      

  • Settings

  • Variables

  • Output

  • Log

  • Training Code

  • Score Code

    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)


    Transform Variables

  • Interval Variables and Transformations

  • Notes: not available


    Regression

  • Parameters:

  • Fit Statistics
     
     Fit Statistic                         Training      Validation            Test 
      
     Akaike's Information Criterion    1644.3102145               .               . 
     Average Squared Error             0.1946745954    0.1965702018    0.2063636757 
     Average Error Function            0.5647156652    0.5673141164    0.5925956925 
     Degrees of Freedom for Error              1406               .               . 
     Model Degrees of Freedom                    18               .               . 
     Total Degrees of Freedom                  1424               .               . 
     Divisor for ASE                           2848            2150            2158 
     Error Function                    1608.3102145    1219.7253502    1278.8215045 
     Final Prediction Error            0.1996591512               .               . 
     Maximum Absolute Error            0.9970106777    0.9587972272     0.956642827 
     Mean Square Error                 0.1971668733    0.1965702018    0.2063636757 
     Sum of Frequencies                        1424            1075            1079 
     Number of Estimate Weights                  18               .               . 
     Root Average Sum of Squares       0.4412194413     0.443362382    0.4542726887 
     Root Final Prediction Error       0.4468323525               .               . 
     Root Mean Squared Error           0.4440347659     0.443362382    0.4542726887 
     Schwarz's Bayesian Criterion      1739.0122662               .               . 
     Sum of Squared Errors             554.43324776    422.62593388     445.3328121 
     Sum of Case Weights Times Freq            2848            2150            2158 
     Misclassification Rate            0.3153089888    0.3172093023    0.3299351251 
     Total Profit for BAD                       452             342             361 
     Average Profit for BAD            0.3174157303    0.3181395349    0.3345690454 
      

  • Target Information:

  • Regression Settings:

  • Output

  • Log

  • Training Code

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Neural Network

    Optimization plot:

    Optimization

     
     Fit Statistic                     Training    Validation       Test 
      
     [ TARGET=BAD ]                         .            .           . 
     Average Profit                        0.27         0.27        0.29 
     Misclassification Rate                0.25         0.27        0.30 
     Average Error                         0.53         0.54        0.57 
     Average Squared Error                 0.18         0.18        0.20 
     Sum of Squared Errors               593.31       458.31      490.37 
     Root Average Squared Error            0.42         0.43        0.44 
     Root Final Prediction Error           0.45          .           . 
     Root Mean Squared Error               0.44         0.43        0.44 
     Error Function                     1769.56      1358.03     1440.46 
     Mean Squared Error                    0.19         0.18        0.20 
     Maximum Absolute Error                0.99         0.99        0.98 
     Final Prediction Error                0.20          .           . 
     Divisor for ASE                    3348.00      2510.00     2510.00 
     Model Degrees of Freedom            115.00          .           . 
     Degrees of Freedom for Error       1559.00          .           . 
     Total Degrees of Freedom           1674.00          .           . 
     Sum of Frequencies                 1674.00      1255.00     1255.00 
     Sum Case Weights * Frequencies     3348.00      2510.00     2510.00 
     Akaike's Information Criterion     1999.56          .           . 
     Schwarz's Bayesian Criterion       2623.20          .           . 
      
  • Network settings

  • Variables

  • Output

  • Log

  • Training Code

  • Score Code

    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)


    Tree

    Model assessment plot:

    SAS Graphics

     
     Fit Statistic                     Training    Validation       Test 
      
     Average Squared Error                 0.22         0.22        0.22 
     Sum of Squared Errors               617.06       466.39      481.08 
     Root Average Squared Error            0.47         0.47        0.47 
     Maximum Absolute Error                0.68         0.68        0.68 
     Divisor for ASE                    2848.00      2150.00     2158.00 
     Total Degrees of Freedom           1424.00          .           . 
     Misclassification Rate                0.32         0.32        0.33 
     Number of Estimated Weights           1.00          .           . 
     Sum of Frequencies                 1424.00      1075.00     1079.00 
     Sum Case Weights * Frequencies     2848.00      2150.00     2158.00 
      
     
      LEAF   Train   Valid   Valid %   Valid %   Train %   Train % 
        ID       N       N     BAD=1     BAD=0     BAD=1     BAD=0 
      
         1    1424    1075     31.81     68.19     31.74     68.26 
      
  • English rules

  • Sequence

  • Matrix

    Target information
    Name: BAD
    Label:
    Measurement: binary

    Tree settings


    Splitting criterion: Chi-Square Test
    Significance Level: 0.2
    Minimum number of observations in a leaf: 1
    Observations required for a split search: 16
    Maximum number of branches from a node: 2
    Maximum depth of tree: 6
    Splitting rules saved in each node: 5
    Surrogate rules saved in each node: 0
    Treat missing as an acceptable value
    Model assessment measure: Proportion Correctly Classified
    Subtree: Best assessment value
    Observations sufficient for split search: 1674
    Maximum tries in an exhaustive split search: 5000
    Do not use profit matrix during split search
    Do not use prior probability in split search
    P-value adjustment: KASS DEPTH
    Apply KASS BEFORE choosing number of branches

  • Log

  • Score Code
    Model assessment settings
    Train data set is not selected for assessment.
    Validation data set is selected for assessment.
    Test data set is not selected for assessment.
    Scored data set: 5000 observations are saved for interactive model assessment.

    SAS Graphics (four plots; images not included)

    Confusion Matrix (Assessed Partition=VALIDATION)

  • Notes: not available


    Assessment

    SAS Graphics (four plots; images not included)
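
Since the assessment plots are not reproduced here, the models can still be compared from the validation misclassification rates printed in each node's fit statistics. The Python sketch below is an added illustration. The labels ("Regression #1" and so on) are hypothetical and simply number the nodes in the order they appear in this report. The Dmneural node is left out because its table lists training-stage statistics only. The neural network figures come from a larger partition (1255 validation cases rather than 1075), so the ranking across model families is indicative at best.

```python
# (label, validation misclassification rate), in report order.
# Labels are hypothetical; rates are copied from the tables above.
models = [
    ("Regression #1", 0.328372093),
    ("Regression #2", 0.343255814),
    ("Regression #3", 0.32),
    ("Regression #4", 0.3227906977),
    ("Neural Network #1", 0.28),
    ("Tree #1", 0.32),
    ("Regression #5", 0.311627907),
    ("Regression #6", 0.3172093023),
    ("Neural Network #2", 0.27),
    ("Tree #2", 0.32),
]

# Lower is better; sorted() is stable, so ties keep report order.
ranked = sorted(models, key=lambda item: item[1])

for rank, (label, rate) in enumerate(ranked, start=1):
    print(f"{rank:2d}. {label:18s} {rate:.4f}")
```

Misclassification is only one criterion. The same list could be re-sorted on validation average squared error if probability quality matters more than the classification decision.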

    End Report