The M’interessa project is, in essence, an information retrieval solution built on binary classification, so information retrieval metrics are used to assess its model.
If the different document groups generated after a binary classification process are…
tp = true positives, relevant documents that are retrieved
tn = true negatives, not relevant documents that are not retrieved
fp = false positives, not relevant documents that are retrieved
fn = false negatives, relevant documents that are not retrieved
…the definitions of the several information retrieval indicators used to assess that process are as follows:
Precision is the fraction of retrieved documents that are relevant. It takes a value between 0 and 1.
Precision = tp / (tp + fp)
Recall is the fraction of all relevant documents that are retrieved. It takes a value between 0 and 1.
Recall = tp / (tp + fn)
Accuracy is the proportion of true results (positive or negative) among all documents. It takes a value between 0 and 1.
Accuracy = (tp + tn) / (tp + tn + fp + fn)
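As an illustration of the three definitions above, here is a minimal Python sketch. The function names, the example lists and the choice to return 0.0 when a denominator is zero are assumptions made for this example, not part of the M’interessa code.

```python
def confusion_counts(relevant, retrieved):
    """Count tp, tn, fp, fn from two parallel lists of booleans."""
    tp = tn = fp = fn = 0
    for is_relevant, is_retrieved in zip(relevant, retrieved):
        if is_relevant and is_retrieved:
            tp += 1  # relevant and retrieved
        elif not is_relevant and not is_retrieved:
            tn += 1  # not relevant and not retrieved
        elif not is_relevant and is_retrieved:
            fp += 1  # not relevant but retrieved
        else:
            fn += 1  # relevant but not retrieved
    return tp, tn, fp, fn


def precision(tp, fp):
    # Assumption: return 0.0 when nothing was retrieved.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Assumption: return 0.0 when there are no relevant documents.
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Made-up example data: eight documents, four of them relevant.
relevant = [True, True, True, False, False, False, False, True]
retrieved = [True, True, False, True, False, False, False, False]

tp, tn, fp, fn = confusion_counts(relevant, retrieved)
print(tp, tn, fp, fn)            # 2 3 1 2
print(precision(tp, fp))         # 2 / 3
print(recall(tp, fn))            # 0.5
print(accuracy(tp, tn, fp, fn))  # 0.625
```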
F-measure (or F1 score)
This measure combines Precision and Recall as their harmonic mean. It takes a value between 0 and 1.
F = 2 x ((Precision x Recall) / (Precision + Recall))
F = 2tp / (2tp + fp + fn)
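The two expressions for F are equivalent, which can be checked numerically. The sketch below uses hypothetical counts and function names chosen for this example, and assumes F is 0.0 when the denominator is zero.

```python
import math


def f1_from_rates(precision, recall):
    """F1 as the harmonic mean of Precision and Recall."""
    if precision + recall == 0:
        return 0.0  # assumption: define F as 0 when both rates are 0
    return 2 * (precision * recall) / (precision + recall)


def f1_from_counts(tp, fp, fn):
    """F1 written directly in terms of the document counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts, not taken from the project data.
tp, fp, fn = 2, 1, 2
p = tp / (tp + fp)  # Precision = 2/3
r = tp / (tp + fn)  # Recall = 1/2

# Both forms give 4/7.
print(f1_from_rates(p, r), f1_from_counts(tp, fp, fn))
print(math.isclose(f1_from_rates(p, r), f1_from_counts(tp, fp, fn)))
```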
ROC (Receiver Operating Characteristic)
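ROC is the plot of Recall (the true positive rate) against the false positive rate, fp / (fp + tn), which equals 1 minus Inverse Recall, as the prediction threshold varies. The area under this curve (AUC) is the value of this indicator, between 0 and 1.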
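A rough sketch of how the area under the ROC curve can be computed: sweep a threshold over the predicted scores, record one (false positive rate, Recall) point per threshold, and integrate with the trapezoidal rule. The function names and the example labels and scores are invented for illustration. The sketch assumes the data contains at least one relevant and one non-relevant document.

```python
def roc_points(labels, scores):
    """Return ROC points (false positive rate, recall), one per threshold.

    labels: 1 for relevant, 0 for not relevant.
    scores: predicted score per document; higher means more likely relevant.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing retrieved
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal rule over consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: one non-relevant document is ranked above a relevant one.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(area_under_curve(roc_points(labels, scores)))  # 8/9, about 0.889
```

A perfect ranking, where every relevant document scores higher than every non-relevant one, gives an area of 1.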
As shown in the Visualization tab of this project website, the parameters of the M’interessa model directly affect all of these assessment metrics. That dashboard was generated from the data of a grid search run to find the best combination of parameters for the M’interessa model.
For the model used in the project, we set the F1 score as the main metric, while prioritizing Precision over Recall. The best results are obtained with the combination of a squared loss function, a neural network with one layer, and a prediction threshold of 0.75.
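To illustrate why a higher prediction threshold tends to favour Precision over Recall, the sketch below compares thresholds of 0.5 and 0.75. The labels and scores are invented and are not results from the M’interessa model. It also assumes that a document is retrieved when its score is greater than or equal to the threshold.

```python
def precision_recall_at(labels, scores, threshold):
    """Precision and Recall when documents with score >= threshold are retrieved."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented data: four relevant documents (1) and four non-relevant ones (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.85, 0.70, 0.55, 0.60, 0.52, 0.30, 0.10]

for threshold in (0.5, 0.75):
    p, r = precision_recall_at(labels, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.5: precision=0.67, recall=1.00
# threshold=0.75: precision=1.00, recall=0.50
```

On this toy data, raising the threshold removes the two false positives but also drops two relevant documents, so Precision rises while Recall falls.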
- Precision and recall. Wikipedia, the free encyclopedia. <https://en.wikipedia.org/wiki/Precision_and_recall>. [Retrieved on 12th July 2016].
- Receiver operating characteristic. Wikipedia, the free encyclopedia. <https://en.wikipedia.org/wiki/Receiver_operating_characteristic>. [Retrieved on 12th July 2016].