Similarity model NLP
Project details
Use case: I have two files, each with multiple columns. The goal is to find the most similar rows across the two files and match them.
Requirements:
>The data contains alphanumeric values, symbols, and abbreviations. Apply the necessary rules, regex, or a custom corpus to preprocess it (a file of abbreviations is available).
>Match using similarity: filter on the numeric parts first, then match on the alphabetic words.
>Set a threshold: above it, return the best match; below it, return the top 3 suggestions.
>Required minimum accuracy: 70%.
>Rule-based or unsupervised models are needed.
>A bonus will be awarded if reinforcement learning or a trained model is added.
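To make the requirements above concrete, here is a rough rule-based sketch in Python of how the steps could fit together: preprocessing, a numeric filter, alphabetic similarity, then a threshold with top 3 suggestions. It is only an illustration under assumptions. The abbreviation map, the sample addresses, the 0.8 threshold, the function names, and the choice of difflib's SequenceMatcher as the similarity measure are all placeholders, not part of the actual data or a required approach.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation map; in practice it would be loaded
# from the abbreviation file mentioned in the requirements.
ABBREVIATIONS = {"st": "street", "rd": "road", "ltd": "limited"}


def normalize(text, abbreviations=ABBREVIATIONS):
    """Lowercase, replace symbols with spaces, expand known abbreviations."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(abbreviations.get(tok, tok) for tok in text.split())


def split_tokens(text):
    """Return (set of purely numeric tokens, string of the remaining tokens)."""
    tokens = normalize(text).split()
    numbers = {t for t in tokens if t.isdigit()}
    words = " ".join(t for t in tokens if not t.isdigit())
    return numbers, words


def match_row(query, candidates, threshold=0.8, top_n=3):
    """Return the best match above the threshold, else the top_n suggestions."""
    q_nums, q_words = split_tokens(query)
    scored = []
    for cand in candidates:
        c_nums, c_words = split_tokens(cand)
        # Numeric filter: drop candidates whose numbers do not overlap
        # with the query's numbers (only when both sides have numbers).
        if q_nums and c_nums and not (q_nums & c_nums):
            continue
        # Alphabetic similarity on the remaining words (0.0 to 1.0).
        score = SequenceMatcher(None, q_words, c_words).ratio()
        scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if scored and scored[0][0] >= threshold:
        return {"best_match": scored[0][1], "suggestions": []}
    return {"best_match": None, "suggestions": [c for _, c in scored[:top_n]]}


# Made-up sample rows, flattened to one string per row for illustration.
candidates = ["12 Main Street", "12 Main Road", "45 Main Street", "Main Square 12"]
result = match_row("12 Main St.", candidates)
```

With real data, each row's columns would likely be compared separately or joined into one string first, and the threshold would be tuned against a labelled sample to check the 70% accuracy target.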
Example data is attached; please share your plan of action for solving it. If the plan looks right, I'll share the actual data once an NDA is in place. We can also schedule a Zoom call to discuss it in more detail.