Loading your ML models in apps.py is recommended. The code there runs once, when the application starts, so the trained model is loaded a single time. If you instead load the model inside the endpoint code, it is reloaded on every call, and each response is slower as a result.
Let’s assume that your trained model is ready to predict stock prices at a given time in the future and is stored in a .joblib file.