Friday, April 30, 2021

Validation of Machine Learning Algorithms and Scenarios - A short article

 

                        Validation of Machine Learning Codes

 *  It is widely accepted that having example datasets and a machine learning algorithm at hand does not guarantee that a machine learning problem can be solved, or that the results will provide the desired solution.

 

*  For example, if one wants a computer to distinguish a photo of a dog from a photo of a cat, one can do so with good examples of dogs and cats. One can train a dog-versus-cat classifier, based on some machine learning algorithm, that outputs the probability that a given photo shows a dog or a cat. For a batch of photos, the quality of that output can be summarised by a validation quantity: a number that reflects how well the classifier performed its computations, and with what level of alacrity and accuracy. By alacrity I mean the performance and speed of the identification process when the algorithm is run over a batch of photos, whatever structuring techniques it relies on (segmentation, clustering, KNN and so on). By accuracy I mean the degree, expressed as a percentage, to which the sample being examined matches the reference samples; in practice this is reported as the share of photos that the classifier labels correctly.
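
The two quantities described above, accuracy and speed, can be measured over a batch with a minimal sketch like the following. Everything in it is an assumption made for illustration: `evaluate_batch` and `toy_predict` are hypothetical names, and a "photo" is reduced to a single number so that the example runs without any image library.

```python
import time


def evaluate_batch(predict, photos, true_labels):
    """Return (accuracy, seconds) for a classifier run over a batch.

    `predict` is a hypothetical function mapping one photo to "dog" or "cat".
    """
    start = time.perf_counter()
    predictions = [predict(photo) for photo in photos]
    elapsed = time.perf_counter() - start

    # Accuracy: share of photos whose predicted label equals the true label.
    correct = sum(p == t for p, t in zip(predictions, true_labels))
    accuracy = correct / len(true_labels)
    return accuracy, elapsed


# Stand-in classifier for illustration only: a "photo" here is just a number,
# and anything above 0.5 is called a dog.
def toy_predict(photo):
    return "dog" if photo > 0.5 else "cat"


photos = [0.9, 0.2, 0.7, 0.4]
true_labels = ["dog", "cat", "cat", "cat"]

accuracy, seconds = evaluate_batch(toy_predict, photos, true_labels)
print(f"accuracy = {accuracy:.0%}, time = {seconds:.6f} s")
```

With a real classifier, `toy_predict` would be replaced by the trained model's prediction function, and the batch would be a set of photos that the model did not see during training.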

 

*  Based on the probability, which is expressed as a percentage, one can then decide the class (that is, dog or cat) from the estimate calculated by the algorithm.

 

*  Whenever the probability obtained is higher for a dog, one can minimise the risk of a wrong assessment by choosing the likelier option, which here means deciding that the photo shows a dog.

 

*  The greater the difference between the probability of a dog and that of a cat, the higher the confidence one can have in the choice.

 

*  When the difference between the probability of a dog and that of a cat is small, it can be assumed that the picture is not clear, or that the subjects in the picture share many features. In other words, some of the cats look similar to the dogs, which causes confusion and may even raise the question of whether the dogs in those pictures are cattish.
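
The decision rule described in the last few points can be sketched as follows. This is only an illustration under stated assumptions: `decide` is a hypothetical helper, there are exactly two classes (so the probability of a cat is one minus the probability of a dog), and the `min_margin` threshold of 0.2 is an arbitrary value chosen for the example, not a recommendation.

```python
def decide(p_dog, min_margin=0.2):
    """Pick the likelier class, or report the photo as ambiguous.

    `p_dog` is the classifier's estimated probability that the photo shows
    a dog. `min_margin` is an illustrative threshold, not a recommended value.
    """
    p_cat = 1.0 - p_dog
    margin = abs(p_dog - p_cat)  # a large margin means high confidence

    if margin < min_margin:
        # The two probabilities are too close: unclear photo or similar features.
        return "ambiguous", margin

    label = "dog" if p_dog > p_cat else "cat"
    return label, margin


for p in (0.95, 0.55, 0.10):
    label, margin = decide(p)
    print(f"P(dog) = {p:.2f} -> {label} (margin {margin:.2f})")
```

Photos reported as ambiguous are the ones worth inspecting by hand, since they are where a cat is most likely to be mistaken for a dog or the other way round.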

 

*  On training a classifier:

You pose a problem and offer examples, each carefully marked with the label or class that the algorithm should learn. The computer then trains the algorithm for a while, and the training process over the dataset finally yields a resulting model.

 

*  The resulting model is what then provides one with an answer, expressed as a probability.
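
To make the training step concrete, here is a minimal sketch that fits a small logistic regression with plain gradient descent. It is one possible algorithm, swapped in for illustration; the article does not name a specific one. The two features (snout length and ear pointiness, both scaled to the range 0 to 1), the data values, and the names `predict_proba` and `train_logistic` are all hypothetical.

```python
import math


def predict_proba(weights, bias, x):
    """Estimated probability that the example with features `x` is a dog."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias with plain gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0

    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Move each parameter a small step against its average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n

    return weights, bias


# Hypothetical labelled examples: [snout length, ear pointiness], 1 = dog, 0 = cat.
features = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.1],
            [0.2, 0.8], [0.3, 0.9], [0.1, 0.7]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(features, labels)

p_long_snout = predict_proba(weights, bias, [0.85, 0.15])
p_pointy_ears = predict_proba(weights, bias, [0.15, 0.85])
print(f"P(dog | long snout)  = {p_long_snout:.2f}")
print(f"P(dog | pointy ears) = {p_pointy_ears:.2f}")
```

The labelled examples go in, the loop runs for a while, and what comes out is a model (here just `weights` and `bias`) that answers new questions with a probability.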

 

*  Labelling is another associated activity that can be carried out, but in the end a probability is just an opportunity to propose a solution and get an answer.

 

*  At this point, one may have addressed all the issues and might guess that the work is finished. One should still validate the results: first to ensure that they are comprehensible to a human, and also to make sure that the user clearly understands the background processes involved, with a break-up analysis of the code and the results that lets other readers understand the code along with the numbers.
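
One simple way to give readers such a break-up of the results is to count how often each true class was given each predicted label. The sketch below assumes nothing beyond the Python standard library; `confusion_counts` and the label lists are hypothetical and exist only to illustrate the idea.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count each (true, predicted) pair so errors can be read off by class."""
    return Counter(zip(true_labels, predicted_labels))


# Hypothetical results for six photos.
true_labels = ["dog", "dog", "cat", "cat", "dog", "cat"]
predicted = ["dog", "cat", "cat", "cat", "dog", "dog"]

counts = confusion_counts(true_labels, predicted)
for actual in ("dog", "cat"):
    for guess in ("dog", "cat"):
        print(f"actual {actual}, predicted {guess}: {counts[(actual, guess)]}")
```

A table like this tells the reader not only how many photos were misclassified, but also which kind of mistake was made, which a single accuracy figure cannot show.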

 

*  This will be elaborated in forthcoming sessions and articles, where we will look at the various ways in which machine learning results can be validated and made comprehensible to users.

 

Last modified: 16:39
