Orca: A Distributed Serving System for Transformer-Based Generative Models

  • Is there any comparison of this with vLLM?