Please cite us if you use the software

Example-6 (Unbalanced data)

Environment check

Check whether the notebook is running on Google Colab and, if so, install pycm.

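import sys
try:
  # google.colab is importable only inside a Colab runtime
  import google.colab
  # Notebook shell escape: install pycm into the running interpreter
  !{sys.executable} -m pip -q -q install pycm
except ImportError:
  # Not on Colab: assume pycm is already installed
  pass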

Binary classification for unbalanced data

from pycm import ConfusionMatrix

Case1 (Both classes have a good result)

$$Case_1=\begin{bmatrix}26900 & 40 \\25 & 500 \end{bmatrix}$$
case1 = ConfusionMatrix(matrix={"Class1": {"Class1": 26900, "Class2": 40}, "Class2": {"Class1": 25, "Class2": 500}})
case1.print_normalized_matrix()
print('ACC:', case1.ACC)
print('MCC:', case1.MCC)
print('CEN:', case1.CEN)
print('MCEN:', case1.MCEN)
print('DP:', case1.DP)
print('Kappa:', case1.Kappa)
print('RCI:', case1.RCI)
print('SOA1:', case1.SOA1)
Predict       Class1        Class2        
Actual
Class1        0.99852       0.00148       

Class2        0.04762       0.95238       


ACC: {'Class2': 0.9976333515383216, 'Class1': 0.9976333515383216}
MCC: {'Class2': 0.9378574017402594, 'Class1': 0.9378574017402594}
CEN: {'Class2': 0.30489006849060607, 'Class1': 0.012858728415908176}
MCEN: {'Class2': 0.46949279678726225, 'Class1': 0.023280122318969122}
DP: {'Class2': 2.276283896527635, 'Class1': 2.276283896527635}
Kappa: 0.9377606597584491
RCI: 0.8682877002417864
SOA1: Almost Perfect

Case2 (The first class has a good result)

$$Case_2=\begin{bmatrix}29600 & 40 \\500 & 25 \end{bmatrix}$$
case2 = ConfusionMatrix(matrix={"Class1": {"Class1": 29600, "Class2": 40}, "Class2": {"Class1": 500, "Class2": 25}})
case2.print_normalized_matrix()
print('ACC:', case2.ACC)
print('MCC:', case2.MCC)
print('CEN:', case2.CEN)
print('MCEN:', case2.MCEN)
print('DP:', case2.DP)
print('Kappa:', case2.Kappa)
print('RCI:', case2.RCI)
print('SOA1:', case2.SOA1)
Predict       Class1        Class2        
Actual
Class1        0.99865       0.00135       

Class2        0.95238       0.04762       


ACC: {'Class2': 0.982098458478369, 'Class1': 0.982098458478369}
MCC: {'Class2': 0.13048897476798949, 'Class1': 0.13048897476798949}
CEN: {'Class2': 0.4655917826576813, 'Class1': 0.06481573363174531}
MCEN: {'Class2': 0.4264929996758212, 'Class1': 0.11078640690031397}
DP: {'Class2': 0.864594924328404, 'Class1': 0.864594924328404}
Kappa: 0.08122239707598865
RCI: 0.022375346499017443
SOA1: Slight

Case3 (The second class has a good result)

$$Case_3=\begin{bmatrix}40 & 26900 \\25 & 500 \end{bmatrix}$$
case3 = ConfusionMatrix(matrix={"Class1": {"Class1": 40, "Class2": 26900}, "Class2": {"Class1": 25, "Class2": 500}})
case3.print_normalized_matrix()
print('ACC:', case3.ACC)
print('MCC:', case3.MCC)
print('CEN:', case3.CEN)
print('MCEN:', case3.MCEN)
print('DP:', case3.DP)
print('Kappa:', case3.Kappa)
print('RCI:', case3.RCI)
print('SOA1:', case3.SOA1)
Predict       Class1        Class2        
Actual
Class1        0.00148       0.99852       

Class2        0.04762       0.95238       


ACC: {'Class2': 0.019661387220098307, 'Class1': 0.019661387220098307}
MCC: {'Class2': -0.13000800945464058, 'Class1': -0.13000800945464058}
CEN: {'Class2': 0.06103563616795208, 'Class1': 0.014927427128936136}
MCEN: {'Class2': 0.03655796690365652, 'Class1': 0.01281422838054554}
DP: {'Class2': -0.8416930356875597, 'Class1': -0.8416930356875597}
Kappa: -0.0017678372492452412
RCI: 0.02192606003351106
SOA1: Poor
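
The binary cases above show why accuracy alone can mislead on unbalanced data. As a minimal sketch, independent of PyCM, the plain-Python helper below recomputes ACC and MCC for Case2 from its four cell counts with the standard formulas. The name `acc_and_mcc` is hypothetical and not part of the library. Class1 is treated as the positive class, and returning 0.0 for a zero denominator is an assumption made for this sketch.

```python
import math

def acc_and_mcc(tp, fn, fp, tn):
    """Return (accuracy, Matthews correlation coefficient) for a 2x2 matrix."""
    total = tp + fn + fp + tn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption for this sketch: report 0.0 when MCC is undefined
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

# Case2 counts as passed to ConfusionMatrix above
# (rows = actual, columns = predicted, Class1 = positive)
acc, mcc = acc_and_mcc(tp=29600, fn=40, fp=500, tn=25)
print('ACC:', acc)
print('MCC:', mcc)
```

Accuracy stays near 0.98 because Class1 dominates the sample, while MCC falls to about 0.13, in line with the values printed for Case2.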

Multi-class classification for unbalanced data

Case1 (All classes have a good result and are unbalanced)

$$Case_1=\begin{bmatrix}4 & 0 & 0 & 1 \\0 & 4 & 1 & 0\\0 & 1 & 4 & 0\\0 & 0 & 1 & 40000 \end{bmatrix}$$
case1 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 4, "Class2": 0, "Class3": 0, "Class4": 1},
        "Class2": {"Class1": 0, "Class2": 4, "Class3": 1, "Class4": 0},
        "Class3": {"Class1": 0, "Class2": 1, "Class3": 4, "Class4": 0},
        "Class4": {"Class1": 0, "Class2": 0, "Class3": 1, "Class4": 40000}})
case1.print_normalized_matrix()
print('ACC:', case1.ACC)
print('MCC:', case1.MCC)
print('CEN:', case1.CEN)
print('MCEN:', case1.MCEN)
print('DP:', case1.DP)
print('Kappa:', case1.Kappa)
print('RCI:', case1.RCI)
print('SOA1:', case1.SOA1)
Predict       Class1        Class2        Class3        Class4        
Actual
Class1        0.8           0.0           0.0           0.2           

Class2        0.0           0.8           0.2           0.0           

Class3        0.0           0.2           0.8           0.0           

Class4        0.0           0.0           2e-05         0.99998       


ACC: {'Class3': 0.9999250299880048, 'Class4': 0.9999500199920032, 'Class2': 0.9999500199920032, 'Class1': 0.9999750099960016}
MCC: {'Class3': 0.7302602381427055, 'Class4': 0.9333083339583177, 'Class2': 0.7999750068731099, 'Class1': 0.8944160139432883}
CEN: {'Class3': 0.3649884090288471, 'Class4': 0.0001575200922489127, 'Class2': 0.25701944178769376, 'Class1': 0.13625493172565745}
MCEN: {'Class3': 0.4654427710721536, 'Class4': 0.00029569133318617423, 'Class2': 0.3333333333333333, 'Class1': 0.17964888034078544}
DP: {'Class3': 2.7032690544190636, 'Class4': 3.1691421556058055, 'Class2': 2.869241573973406, 'Class1': 'None'}
Kappa: 0.8666333383326446
RCI: 0.8711441699127427
SOA1: Almost Perfect

Case2 (All classes have the same result and are balanced)

$$Case_2=\begin{bmatrix}1 & 1 &1&1 \\1 & 1&1&1\\1&1&1&1\\1&1&1&1 \end{bmatrix}$$
case2 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class2": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class3": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class4": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1}})
case2.print_normalized_matrix()
print('ACC:', case2.ACC)
print('MCC:', case2.MCC)
print('CEN:', case2.CEN)
print('MCEN:', case2.MCEN)
print('DP:', case2.DP)
print('Kappa:', case2.Kappa)
print('RCI:', case2.RCI)
print('SOA1:', case2.SOA1)
Predict      Class1       Class2       Class3       Class4       
Actual
Class1       0.25         0.25         0.25         0.25         

Class2       0.25         0.25         0.25         0.25         

Class3       0.25         0.25         0.25         0.25         

Class4       0.25         0.25         0.25         0.25         


ACC: {'Class3': 0.625, 'Class4': 0.625, 'Class2': 0.625, 'Class1': 0.625}
MCC: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
CEN: {'Class3': 0.8704188162777186, 'Class4': 0.8704188162777186, 'Class2': 0.8704188162777186, 'Class1': 0.8704188162777186}
MCEN: {'Class3': 0.9308855421443073, 'Class4': 0.9308855421443073, 'Class2': 0.9308855421443073, 'Class1': 0.9308855421443073}
DP: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
Kappa: 0.0
RCI: 0.0
SOA1: Slight
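
The Kappa values printed for the two multi-class cases above can be reproduced by hand. The following is a minimal sketch of Cohen's kappa in plain Python, assuming the matrix is given as a list of rows (rows = actual, columns = predicted). The function name `cohen_kappa` and the variable names are hypothetical and not part of PyCM.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the row and column marginals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    # Note: undefined (division by zero) when expected == 1
    return (observed - expected) / (1 - expected)

unbalanced_good = [[4, 0, 0, 1], [0, 4, 1, 0], [0, 1, 4, 0], [0, 0, 1, 40000]]
balanced_uniform = [[1, 1, 1, 1] for _ in range(4)]
print('Kappa (Case1):', cohen_kappa(unbalanced_good))
print('Kappa (Case2):', cohen_kappa(balanced_uniform))
```

For the uniform matrix the observed and chance agreement are both 0.25, so kappa is exactly 0.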

Case3 (A class has a bad result and is a bit unbalanced)

$$Case_3=\begin{bmatrix}1 & 1 &1&1 \\1 & 1&1&1\\1&1&1&1\\10&1&1&1 \end{bmatrix}$$
case3 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class2": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class3": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class4": {"Class1": 10, "Class2": 1, "Class3": 1, "Class4": 1}})
case3.print_normalized_matrix()
print('ACC:', case3.ACC)
print('MCC:', case3.MCC)
print('CEN:', case3.CEN)
print('MCEN:', case3.MCEN)
print('DP:', case3.DP)
print('Kappa:', case3.Kappa)
print('RCI:', case3.RCI)
print('SOA1:', case3.SOA1)
Predict       Class1        Class2        Class3        Class4        
Actual
Class1        0.25          0.25          0.25          0.25          

Class2        0.25          0.25          0.25          0.25          

Class3        0.25          0.25          0.25          0.25          

Class4        0.76923       0.07692       0.07692       0.07692       


ACC: {'Class3': 0.76, 'Class4': 0.4, 'Class2': 0.76, 'Class1': 0.4}
MCC: {'Class3': 0.10714285714285714, 'Class4': -0.2358640882624316, 'Class2': 0.10714285714285714, 'Class1': -0.2358640882624316}
CEN: {'Class3': 0.8704188162777186, 'Class4': 0.6392779429225796, 'Class2': 0.8704188162777186, 'Class1': 0.6392779429225794}
MCEN: {'Class3': 0.9308855421443073, 'Class4': 0.647512271542988, 'Class2': 0.9308855421443073, 'Class1': 0.647512271542988}
DP: {'Class3': 0.16596653499824943, 'Class4': -0.3319330699964992, 'Class2': 0.16596653499824943, 'Class1': -0.33193306999649924}
Kappa: -0.07361963190184047
RCI: 0.11603030564493627
SOA1: Poor
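
The normalized matrix printed for this case divides each row by its total, so every row sums to 1 regardless of how many samples the class has. A minimal sketch of that row normalization, assuming rounding to five decimals as in the printed output and no empty rows (the helper name `normalize_rows` is hypothetical):

```python
def normalize_rows(matrix, digits=5):
    """Divide each row of a confusion matrix by its row total."""
    return [[round(value / sum(row), digits) for value in row] for row in matrix]

case3_rows = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [10, 1, 1, 1]]
for label, row in zip(["Class1", "Class2", "Class3", "Class4"], normalize_rows(case3_rows)):
    print(label, row)
```

For Class4 this gives 10/13 for the first column and 1/13 for the others, matching the last row of the table above.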

Case4 (A class is very unbalanced and gets a bad result)

$$Case_4=\begin{bmatrix}1 & 1 &1&1 \\1 & 1&1&1\\1&1&1&1\\10000&1&1&1 \end{bmatrix}$$
case4 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class2": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class3": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class4": {"Class1": 10000, "Class2": 1, "Class3": 1, "Class4": 1}})
case4.print_normalized_matrix()
print('ACC:', case4.ACC)
print('MCC:', case4.MCC)
print('CEN:', case4.CEN)
print('MCEN:', case4.MCEN)
print('DP:', case4.DP)
print('Kappa:', case4.Kappa)
print('RCI:', case4.RCI)
print('SOA1:', case4.SOA1)
Predict      Class1       Class2       Class3       Class4       
Actual
Class1       0.25         0.25         0.25         0.25         

Class2       0.25         0.25         0.25         0.25         

Class3       0.25         0.25         0.25         0.25         

Class4       0.9997       0.0001       0.0001       0.0001       


ACC: {'Class3': 0.999400898652022, 'Class4': 0.000998502246630055, 'Class2': 0.999400898652022, 'Class1': 0.000998502246630055}
MCC: {'Class3': 0.24970032963739885, 'Class4': -0.43266656861311537, 'Class2': 0.24970032963739885, 'Class1': -0.43266656861311537}
CEN: {'Class3': 0.8704188162777186, 'Class4': 0.0029588592520426657, 'Class2': 0.8704188162777186, 'Class1': 0.0029588592520426657}
MCEN: {'Class3': 0.9308855421443073, 'Class4': 0.002903385725603509, 'Class2': 0.9308855421443073, 'Class1': 0.002903385725603509}
DP: {'Class3': 1.6794055876913858, 'Class4': -1.9423127303715728, 'Class2': 1.6794055876913858, 'Class1': -1.9423127303715728}
Kappa: -0.0003990813465900262
RCI: 0.5536610475678804
SOA1: Poor

Case5 (A class is a bit unbalanced and gets a bad result)

$$Case_5=\begin{bmatrix}1 & 1 &1&1 \\1 & 1&1&1\\1&1&1&1\\10&10&10&10 \end{bmatrix}$$
case5 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class2": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class3": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class4": {"Class1": 10, "Class2": 10, "Class3": 10, "Class4": 10}})
case5.print_normalized_matrix()
print('ACC:', case5.ACC)
print('MCC:', case5.MCC)
print('CEN:', case5.CEN)
print('MCEN:', case5.MCEN)
print('DP:', case5.DP)
print('Kappa:', case5.Kappa)
print('RCI:', case5.RCI)
print('SOA1:', case5.SOA1)
Predict      Class1       Class2       Class3       Class4       
Actual
Class1       0.25         0.25         0.25         0.25         

Class2       0.25         0.25         0.25         0.25         

Class3       0.25         0.25         0.25         0.25         

Class4       0.25         0.25         0.25         0.25         


ACC: {'Class3': 0.7115384615384616, 'Class4': 0.36538461538461536, 'Class2': 0.7115384615384616, 'Class1': 0.7115384615384616}
MCC: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
CEN: {'Class3': 0.6392779429225794, 'Class4': 0.6522742127953861, 'Class2': 0.6392779429225794, 'Class1': 0.6392779429225794}
MCEN: {'Class3': 0.647512271542988, 'Class4': 0.7144082229288313, 'Class2': 0.647512271542988, 'Class1': 0.647512271542988}
DP: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
Kappa: 0.0
RCI: 0.0
SOA1: Slight

Case6 (A class is very unbalanced and gets a bad result)

$$Case_6=\begin{bmatrix}1 & 1 &1&1 \\1 & 1&1&1\\1&1&1&1\\10000&10000&10000&10000 \end{bmatrix}$$
case6 = ConfusionMatrix(
    matrix={
        "Class1": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class2": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class3": {"Class1": 1, "Class2": 1, "Class3": 1, "Class4": 1},
        "Class4": {"Class1": 10000, "Class2": 10000, "Class3": 10000, "Class4": 10000}})
case6.print_normalized_matrix()
print('ACC:', case6.ACC)
print('MCC:', case6.MCC)
print('CEN:', case6.CEN)
print('MCEN:', case6.MCEN)
print('DP:', case6.DP)
print('Kappa:', case6.Kappa)
print('RCI:', case6.RCI)
print('SOA1:', case6.SOA1)
Predict      Class1       Class2       Class3       Class4       
Actual
Class1       0.25         0.25         0.25         0.25         

Class2       0.25         0.25         0.25         0.25         

Class3       0.25         0.25         0.25         0.25         

Class4       0.25         0.25         0.25         0.25         


ACC: {'Class3': 0.7499500149955014, 'Class4': 0.25014995501349596, 'Class2': 0.7499500149955014, 'Class1': 0.7499500149955014}
MCC: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
CEN: {'Class3': 0.0029588592520426657, 'Class4': 0.539296694603886, 'Class2': 0.0029588592520426657, 'Class1': 0.0029588592520426657}
MCEN: {'Class3': 0.002903385725603509, 'Class4': 0.580710610324597, 'Class2': 0.002903385725603509, 'Class1': 0.002903385725603509}
DP: {'Class3': 0.0, 'Class4': 0.0, 'Class2': 0.0, 'Class1': 0.0}
Kappa: 0.0
RCI: 0.0
SOA1: Slight
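
The per-class ACC values in the multi-class cases are one-vs-rest accuracies: for each class, the true positives and true negatives are counted against all other classes combined. A minimal plain-Python sketch under that assumption follows (the helper name `per_class_accuracy` is hypothetical and not part of PyCM). It reproduces the ACC values printed for Case6.

```python
def per_class_accuracy(matrix):
    """One-vs-rest accuracy for each class of a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    accuracies = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                  # rest of the actual-class row
        fp = sum(row[k] for row in matrix) - tp   # rest of the predicted-class column
        tn = n - tp - fn - fp
        accuracies.append((tp + tn) / n)
    return accuracies

case6_rows = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1],
              [10000, 10000, 10000, 10000]]
print(per_class_accuracy(case6_rows))
```

The three small classes reach an accuracy of about 0.75 only because of the large number of true negatives they gain from Class4, even though MCC, Kappa, and RCI are all 0 for this matrix.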