AWS Neuron implementation for Inf1 and Inf2 chips
Project detail
I am looking for someone with AWS expertise to help me implement AWS Neuron on my Inf1 and Inf2 chips. Specifically, I need help setting up the product processors for the implementation, working from an existing codebase.
You must be able to use PyTorch and AWS Neuron, and to combine existing models from Hugging Face with some custom logic. In the end, we should have a high-performance model running on an Inferentia1 chip that we can expose through a public endpoint.
This project must be done with the utmost accuracy, so please only reach out if you can demonstrate past success with this technology. Thanks so much!